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ABSTRACT 
Various  methods   are   studied   for  the  relocation,    or  movement, 
including  address  mapping,    of    programs   within  a  multiprogrammed  digital 
computer.      The   aim  of   doing   so    is   to   determine  the   best  method   for  use 
in  the    limited   time-shared   computing   system  proposed   for  development 
in   the  Digital   Control   Laboratory  of   the   Naval   Postgraduate   School.       In 
this    light,    the  concepts   of   time-sharing   and  multiprogramming   are  dis- 
cussed,   as    is   the    implementation  of   relocation    in   a  very    large   computer 
obtained   for  the  School's  main  computer   facility.      The   features   and 
requirements   of   the  D.C.L.    are   then   established   and   evaluated.       It    is 
found   for  the   Laboratory  that   complete    job   swapping  will   be  a   fully 
satisfactory  method  of   relocation.      The  time  taken  will   not   be   exces- 
sive,   and  this  method  will   be   the  easiest   to    incorporate    in  the   time- 
sharing  system.      Details  of   a   possible   implementation   are  given   in  an 
appendix  to   the  thesis. 
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1.      Introduction. 

This   thesis   studies   the  problem  of  the  relocation  of   programs 
within  a  mult iprogrammed   digital   computer.      The   objective  of   doing   so 
is   to  determine  the  most   suitable  method  of   relocation  to   be   applied   in 
the   limited  time-sharing   system  proposed   for  development    in  the  Digital 
Control  Laboratory  of   the   Naval  Postgraduate   School. 

The  plan  of  the  thesis  is  to  progress  from  general  consideration  of 
background  matter  to  specific  investigation  of  the  D.C.L.  system  and  its 
requirements.      The  following  subjects    are  treated: 

a)  a  brief   survey  of  the  meaning  and   potentialities   of   the 
end   application,   time-shared  computing; 

b)  consideration  of  the   general  multiprogramming   environ- 
ment  and  the   need  for  program  relocation; 

c)  study  of   a   number  of   techniques  of   relocation; 

d)  investigation  of  the   relocation  method   implemented   in  a 
very   large  time-sharing  computer,    the    IBM     System/360,   Model    67; 

e)  study  of   the  Digital  Control   Laboratory's   requirements 
and   features  of    its   new  SDS  930-centered  computing   system;    and 

f)  recommendation   and   conclusion. 

The  primary  investigative  tool  used  is  comparative  analysis,  as, 
for  example,  of  different  techniques  of  program  relocation.   Expository 
discussion  is  interleaved  with  consideration  of  advantages  and  disadvan- 
tages. 

The  following  section  introduces  time-shared  computing. 


International  Business  Machines  Corporation.   The  meaning  of  all 
abbreviations  used  in  this  thesis  is  given  in  the  table  on  page  9. 
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2.      Time-Shared   Computing. 

A  major    impediment   to   the   full    use  of  digital   computers   has   been 

2 
expressed   as  the   "speed-cost  mismatch"     between  man  and  machine.      Com- 
puters  are  very  fast,    but  expensive.      Men,    relatively,    are   slow,    but 
their  time   is  cheap.      One  result   of   this  mismatch  has   been   a  tendency 
to   hand  the  machines  over   to  the  group  of  computer  professionals   -  pro- 
grammers,   operators,    and  managers    -  who   know  best   how  to   keep  them  busy. 
The   real   users   of  computer   power   -   the   professors,    executives,    colonels, 
and  generals   -   are,    in  the  main,    isolated   from  direct  contact   with  the 
machines.      No  one  would   suggest   that    the  professors   and  colonels   become 
full-time  programmers   or  operators;    but   there   are  many   problems  where 
time   and  meaning   are  critical,   where  the   isolation  of  the  real   users    is 
a  distinct   disadvantage. 

One   answer  to   the   speed-cost   mismatch,    and  to  the  matter  of    letting 
the  real   user  have  direct   contact   with  the  machine,    is  time-shared, 
multiple-access,   on-line^  computing.      Here   a  computing   system   is  designed 
so   that   a   number  of  different,    possibly  distant,    users   have  concurrent, 
real-time  access   to    it.      The    speed  of  the  computer    is   put  to  good   use 
in  moving   between   each  user's   tasks,    solving  them  at    a  rate   which,    in  a 
well-designed  system,    approaches   that   of   human  reaction.      The  expense 
of   operating  the   system  may   now  be   spread  over   its  many  concurrent   users. 
Because   there   are   relatively   natural   programming    languages   available,    and 
because  the  means  of   access   to   the  computer  can   be   an  easily-employed 
device    (teletypewriters   and  CRT  displays   with  typewriter-like   keyboards 

2 
Licklider,    J.C.R.,   Man-Computer   Symbiosis    (IRE  Trans,    on  Human 

Factors    in  Electronics,    March    1960),    p.    7. 

3 
For  the   sake  of   brevity,    all   these   adjectives  will   be    implied 

when,    henceforth,   only   "time-shared"   is  written. 
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are  most   common),   direct   contact  of   users    is   facilitated. 

Time-sharing  offers    several  other   interesting  possibilities.      One 
of   these   is    in  the   area  of   formulative,   or  trial-and-error ,    problem- 
solving.      Since  computers   follow  only  the  steps    for  which  they   have   been 
programmed,    they  have  been  most   useful   heretofore   in  solving  completely 
pre- formulated   problems,    using   pre-determined  procedures.      Now,    with 
time-shared   computing,   the  user  with  a   less-well-understood   problem  has 
the  opportunity  to  sit   at   his   access   terminal   and  to    interact   with  the 
machine  on   an  almost   conversational   basis.      If   there    is   a  solution  to 
his   problem,   he  may  be   able  to    "feel"  his   way  to    it. 

There    is,   with  time-sharing,    an   advantage   in  the  management  of   an 
information  base.      Only  one,    central    file  need   be  maintained,    as    its 
contents  may  be  made   accessible  to   all   authorized   users   at   their  termi- 
nals;   further,    once   a  user   enters   data   into  the   system,    it    is    immediately 
and    identically  available  to  other    intended   subscribers. 

Time-sharing  can  extend  the  power  of   a   large   computer.      This    is    its 
major   superiority  over   a  proliferation  of    independent    small  machines. 
For  the  same  computing  power   and  number  of   users,    a  time-sharing   system 
with  reasonable  communications   costs   appears   to   be   less   expensive   than 
a  system  of    independent  machines. 

Perhaps   the  most    interesting   possibility  of   time-shared   computing 
is   the  concept  of   a    "computer  utility".      That    is,    like  water  or   electric- 
ity,   computing  power  would   be   furnished   from  a  generating  element    (here, 
the  central   computer)   to    locations   where    it   can  be  used,    there   to  be 
"turned  on"    (employed)    when  needed   and    "turned   off"  when  finished. 

The  number   and  vitality  of   current    applications  of  time-sharing 
demonstrate  that   this   means   of   computing   is   quite   practical.      One 
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4 
compilation   lists   40    installations.        Educational    institutions    with  time- 
sharing systems    include    Stanford,   California  at   Berkeley,    and  the  Massa- 
chusetts  Institute   of  Technology.      The    Naval    Postgraduate   School   will 
soon    join  this    group,    not   only  with   its    D.C.L.    system  but   also    through 
new  equipment   being   installed    in  the  central  Computer   Facility.      Com- 
mercial   systems   are   becoming   quite   numerous.      Many   of   these   are   available 
anywhere   there   are  telephones,    for  they  employ  the  telephone   lines   to 
link  remote   terminals   to   the   computer.      Shown   in  Figure   1    is   the  equip- 
ment  used   in  one   such  commercial   time-sharing  system. 

It    is   not   difficult,    finally,   to   imagine  military   applications. 
For  example,    the  possibility  of   formulative   problem-solving  might   be  as 
valuable  to   a  military   research  organization  as   to   a   similar  civilian 
enterprise.      In   a   large   supply  center  or   personnel   directorate,    time- 
sharing's  advantage   in  management   of   a  centralized    information  base  could 
be  useful.      Because  the  remote  user's   terminal  may  typically  be   a   light- 
weight  teletypewriter,    linked  by  radio   or  wire  to   the  computer,   time- 
sharing may  be   feasible  even  on  the  battlefield.      A  tactical   system 
could   serve   as    a  message   processor,   handling  battle  reports,    logistics 
status,    etc.,    and    also   as    a  means    for   rapid   computation   at   diverse    loca- 
tions  of   such  time-consuming  problems   as    aircraft   schedules    and  embark- 
ation tables.      An   added   advantage  here   would  be  that   when  a  using  unit 
was    not    in  action,    and   its    computing  requirement   was   therefore   small, 
an  expensive   computer  would  not   be   idled;    only  a  terminal  would   not   be 
in  use. 


4 
Time-Sharing   System   Scorecard,   No.    4    (Computer  Research  Corpora- 


tion,   Fall    1966). 
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Figure    1.      Configuration  of   a  commercial  time-sharing  system; 
adapted  from  that  of  the  General   Electric  Company 
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3.      Multiprogramming   and   Program  Relocation. 

Multiprogramming   -   Definition 

The  time-shared   computing   just    introduced   implies   a  multiprogramming 
environment.      That    is,    more   than  one  active  user   program  will   simultan- 
eously be   present  within  the   computing   system.      The   processor  must   be 
operated  to   permit  the  execution  of   a  number  of   programs   in  such   a  way 
that   none  of   the  programs   need   be   completed   before   another    is    started 
or    continued. 

Goals 

Multiprogramming  a  computer,  for  whatever  application,  may  be  done 
with  any  of  a  number  of  possible  system  goals  in  mind.   One  of  these 
might  be  termed,  "improvement  in  user  service".   Possible  sub-goals  to 
this  include  reduction  in  turnaround  time  and  an  increase  in  the  number 
of  allowable  concurrent  users.   In  general,  pursuit  of  this  goal  reflects 
an  awareness  of  the  view  that  a  computer  is  properly  the  servant  of  its 
users. 

A  second  goal,  conversely,  aims  at  realizing  the  greatest  possible 
efficiency  in  the  employment  of  the  physical  components  of  the  computing 
system.   It  thus  recognizes  that  computers  are  very  expensive  machines. 
This  goal  stresses  the  achievement  of  economy. 

The  third,  and  final,  multiprogramming  goal  to  be  considered  here 
is  a  variation  of  the  first.   It  can  be  expressed  as  "improvement  in 
service  to  all  users,  but  with  special  emphasis  on  the  needs  of  some". 
That  is,  certain  users  or  user  classes  would  be  favored,  with  the  system 
responding  preferentially  to  their  requests.   Presumably,  these  special 
users  would  be  either  those  whose  needs  so  required  or  those  whose  equip- 
ment permitted  them  to  take  advantage  of  their  favored  position.   An 
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example  of  the  former  might  be  a  hybrid  system  simulator,  who  typically 
requires  a  definite  amount  of  digital  computing  service  at  fixed  time 
intervals.   He  must  have  this  service  at  the  prescribed  times  if  his 
simulation  is  to  function  at  all.   The  latter  could  include  persons  using 
an  on-line  display  console  to  interface  with  the  computer.   The  relative 
problem-solving  power  of  a  good  display,  properly  backed  with  necessary 
software,  compared  to  many  other  input/output  media,  is  So  great  as  to 
probably  warrant  favored  consideration  to  its  users. 

Problems  of  Multiprogramming 

The  designer  of  a  multiprogrammed  computer  must  solve  a  number  of 
rather  special  problems.   In  general,  these  problems  are  either  not 
found,  or  are  experienced  in  much  less  severe  degree,  in  other  forms 
of  computing.   Their  solutions  seem  particularly  critical  in  the  time- 
sharing application,  where  the  human  user,  at  his  terminal,  is  immediate- 
ly awaiting  answers  from  his  programs. 

These  problems  include: 

a)  scheduling  -  the  order  in  which  the  different  programs 
actively  present  in  the  system  will  be  served  must  be  determined; 

b)  input /output  communications  -  messages  to  and  from  system 
users,  programs  and  answers,  must  be  handled; 

c)  memory  allocation  -  programs  must  be  dynamically  assigned 
to  and  within  the  different  levels  of  system  storage; 

d)  security  -  necessary  isolation  must  be  provided  between 
different  users'  programs,  and  between  a  user  and  a  system  program  not 
his  to  use; 

e)  system  monitoring  -  the  complexity  of  multiprogrammed 
operation  often  warrants  special  consideration  for  the  task  of  monitoring 
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system  functioning  and  accounting  for  the  charges  to  each  user;  and 

f)   program  relocation,  which  is  the  subject  of  this  thesis. 
Program  Relocation 

In  general,  in  the  on-line  multiprogramming  implied  here,  it  is  not 
possible  to  process  a  program  through  to  completion  in  one  "turn"  of  the 
system.   That  is,  the  requirement  to  provide  a  sufficiently  brief  response 
time  to  each  user  -  to  receive  his  program  or  to  provide  some  answers  - 
necessitates  the  interruption,  before  completion,  of  all  but  the  brief- 
est processes.   Further,  it  is  disadvantageous  to  try  to  allow  interrup- 
ted programs  to  remain,  unaltered,  in  main  memory.   Such  practice  is 
probably  impossible,  in  view  of  the  unforeseeable  requirements  of  sub- 
sequent users'  programs,  and  to  attempt  it  would  certainly  result  in  a 
severe  limitation  upon  the  number  of  allowable  concurrent  system  users. 

Thus  there  is  a  need  for  the  movement  of  programs  about  the  com- 
puting system  as  their  status  changes.   It  is  this  movement  which  is 
called,  in  general,  program  relocation.   Some  examples  follow.   A  pro- 
gram which  at  one  instant  of  time  resides  in  main  memory  for  purposes  of 
active  computation  may,  a  few  moments  later,  be  placed  for  temporary 
storage  on  a  drum  or  disc  file.   Conversely,  a  routine  stored  on  mag- 
netic tape  may,  at  some  time,  be  called  into  core  for  processing.   Or, 
a  data  area  may  be  required  by  a  running  program  when,  as  the  result  of 
prior  relocation,  the  two  are  in  different  parts  of  main  memory. 

Another  way  to  consider  program  relocation  is  to  realize  that  to 
function  properly,  the  computing  system  must  be  able  to  access  any  pro- 
gram in  the  system  at  any  time.   No  program  can  ever  become  "lost"  to 
the  central  processor  and  its  operating  system.   Thus  program  "movement" 
requires  a  consideration  of  "access"  methods.   In  studying  relocation, 


this  thesis  must  then  investigate  the  addressing  requirements  of  multi- 
programmed  computations.   Addressing  methods,  in  fact,  are  at  the  center 
of  the  topic  of  relocation,  for  they  have  a  direct  and  important  effect 
upon  the  speed  and  efficiency  of  the  computer. 

The  following  section  begins  this  study  by  considering  a  number  of 
relocation  techniques. 
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4.      Techniques   of   Relocation. 

First   Considerations 

System  Goal    and   End   Application.      Some   possible   goals   of   multipro- 
gramming were  previously  given.      The   goal  which   is   chosen  for   a   system 
must   be  kept    in  mind  when  weighing  the  relative  merits   of  different   re- 
location techniques.      The  end   application  of  the  computing   system  may 
also    influence  the  choice  of  relocation  method.      As   indicated   before, 
the  end   application   important  to  this   thesis    is   time-shared   computing. 

Size  of  Main  Memory.      Usually,    in  the  multiple-access,   on-line 
computing   implied  here,    main  memory  will  be   insufficient   in  size  to   hold 
all   active   computations    simultaneously.      This    is    in  contrast   to  the 
special   purpose  Naval   and  Marine  tactical  data   systems;    there,   the  total 
quantity  of   stored  program   is   known,    and  because  of  the  real-time  re- 
quirements  of   the   systems,   main  memory  holds    it    all.      Here,    the   system 
works    under   a  varying  program   load.      Further,    there  will   be   a  need  to 
store   some   programs    in   an   inactive   status.      Thus   considerations  of  over- 
all  economy   suggest   a  main  memory    limited   in  size,    so   that    it    cannot  be 
expected,    in  general,   to  hold   all   processes   at  once. 

Single-Level   Store.      With  the   consequent   use  of   several   different 
storage  media,    there   has   developed   the  concept   of  the   "single-level 
store".        Since  for  a   user,    it  would  be  difficult,    if  not    impossible, 
to  keep  track  of  where  his   program  resides   at    any  moment,    he   is  not   ex- 
pected to   do   so.      Instead,    the  operating  system  records    and  uses  this 
information,   while  the  user  codes    as    if  his   programs  were   always    in  main 
memory.      To   him,    the   system  does    not    appear   to   have    its    actual   hierarchy 

Kilburn,   T. ,    et   al.,   One-Level   Storage   System    (IRE  Trans,    on 
Electronic   Computers,    April    1962),    p.    223. 
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of  different  storage  media,  such  as  core,  disc,  and  tape;  he  sees  it  as 
possessing  one  "single-level  store". 

Address /Location  Map.   It  may  be  desirable,  for  greater  flexibility, 
that  the  system  not  be  required  to  place  a  program  in  the  same  region  of 
main  memory  each  time  it  is  called  up.  To  so  require  would  incur  extra 
overhead  upon  program  exchange  and  would  complicate  the  queueing  of  wait- 
ing processes,  although  it  may  be  justified  for  other  reasons.   Thus  a 
means  is  needed  to  relate   the  addresses  used  by  a  computation  to  the 
physical  locations  in  main  memory  actually  employed  for  storage.   This 
means  may  be  called,  after  Dennis  ,  the  "address/location  map".   It  may 
be  considered  to  effect  a  translation  from  an  address,  or  "name",  space 
to  a  location  space.   This  thesis  will  often  speak  of  program  relocation 
in  terms  of  methods  of  creating  and  maintaining  this  "map". 

Memory  Protection.   As  suggested  previously,  proper  isolation  be- 
tween different  users'  programs,  or  "memory  protection",  must  be  a  part 
of  any  multiprogrammed  computing  system.   It  is  not  difficult  to  think 
of  pertinent  reasons.   In  a  commercial  system,  one  business  user  must 
not  be  permitted  access  to  another  firm's  secrets  stored  in  the  computer. 
In  a  military  application,  classified  information  must  be  protected. 
Memory  protection  is  needed  in  any  situation  because  new  programs,  which 
often  contain  errors,  must  be  prevented  from  interfering  with  other  pro- 
cesses in  the  system.   Because  the  form  of  memory  protection  provided 
is  often  affected  by  the  choice  of  relocation  technique,  memory  protec- 
tion will  be  treated  as  a  secondary  subject  in  the  remainder  of  this 
thesis. 


Dennis,  J.B. ,  Segmentation  and  the  Design  of  Multiprogrammed 
Computer  Systems  (Journal  of  the  ACM,  October  1965),  p.  590. 
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System  Evolution.   One  lesson  learned  by  designers  in  recent  years 
is  that  provision  must  be  made  for  evolution  in  the  design  of  any  com- 
puter system.   Changes  in  technology  and  applications  occur  too  rapidly 
for  this  not  to  be  so.   There  are  numerous  ways  to  provide  for  system 
evolution;  whichever  seem  appropriate,  any  method  of  handling  the  re- 
location of  mult i programmed  computations  must  be  judged  partly  upon  its 
ability  to  evolve. 

Quantitative  Aspects 

There  are  two  very  useful  parameters  which  may  be  measured  in  the 
evaluation  of  a  relocation  technique.   These  parameters  are: 

a)  the  physical  size  of  the  relocation  program; 

b)  the  time  required  to  effect  relocation. 

The  size  of  relocation  coding  is  important  because  it  represents 
a  demand  upon  a  key  system  resource,  storage.   When  a  new  computer 
system  is  being  planned,  consideration  of  the  probable  size  of  the  re- 
location program  may  affect  the  amount  of  storage  specified.   In  an 
existing  system,  the  otherwise  available  amount  of  system  storage  is 
reduced  by  the  size  of  the  relocation  code. 

It  is  reasonable  to  measure  the  size  of  the  relocation  program  in 
terms  of  main  memory  computer  words  or  bytes,  whichever  is  appropriate 
to  the  particular  computer  under  consideration,  since  it  is  in  main 
memory  that  the  code  will  reside  while  being  executed.   The  number  of 
words  or  bytes  required  is  called  here,  N.   In  many  cases,  not  all  of 
the  relocation  program  need  be  in  main  memory  at  all  times.   For  reasons 
of  greater  economy  and  space-saving,  lesser-used  relocation  routines  are 
often  placed  in  other,  cheaper  storage.   Of  course,  the  disadvantage  of 
doing  so  is  the  greater  access  times  to  such  routines.   In  general,  the 
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size  of  the  relocation  program  may  be  expressed  as 

N  =  Nmm  +  N2s  +  N3s  +  •••  C4.1) 

wnere  Nmm'  ^2s'  N3s'  •*■  re^er  to  tne  mai-n  memory,  secondary,  tertiary, 
...  storage  used,  these  amounts  being  specified  in  terms  of  main  memory 
words  or  bytes. 

The  value  of  N  depends   upon  the   features   of  the   instruction   set  of 
the  computer   under  consideration   and   upon  the  relocation  technique  used. 
For  a   smaller   N,    efficient   table-search   and   powerful    input/output    in- 
structions  are  necessary,    as   these    are    found  to   be  the  major   tasks  of 
relocation.      Also,   as   to   be  expected,    the  more   complex  the  relocation 
technique,   the    longer  the   implementing  code.      In  general,    considering 
the  normal   size  of   such  necessary  components   as   a   loader,   many  reloca- 
tion programs   occupy  one  thousand  or  more  computer  words. 

Especially  critical  in  a  time-sharing  application,  where  human  users 
are  waiting  for  answers  at  their  terminals,  is  the  time  taken  for  program 
relocation.  Even  though  a  useful  function  is  being  performed,  relocation 
time  is  all  overhead,  non-productive  in  terms  of  actual  execution  of  user 
programs.      This  time  may   be  considered   in  two   ways: 

a)  the   absolute   amount    required; 

b)  the  relative   effect,    first,    in  terms   of   increased  program 
execution   time  due  to   address   translations,    and   second,    as    a  contribu- 
tion to   total    system  overhead. 

In  the  absolute  measurement,    relocation  time  will   be  denoted  here 

as   T  .      It   will  often  be   convenient   to  measure   T     over  one   user's    assigned 
r  r  ° 

time-slice,   or  quantum,   q.      There   are  two  major   contributions   to   reloca- 
tion  time.      These   are   the   time  required   for   address   mapping,    T    ,    and  the 
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time  taken  for  program  exchange  (the  input  or  output  of  code),  T  .   Thus 

Tr  =  Ta  +  Te  (4.2) 

Further,  it  may  be  seen  that 


T  =  t  Lf  (4.3) 

a    a 

where  t   is  the  time  required  to  translate  one  address,  L  is  the  length 
of  program  under  consideration,  and  f  is  the  fraction  of  program  instruc- 
tions containing  a  memory  address.   T_  will  be  discussed  elsewhere  in 
this  section  of  the  thesis. 

For  relative  measurements,  it  is  possible  to  express  the  fractional 
increase  in  program  execution  time  due  to  address  mapping,  called  here 
Fa,  as 

Fa  =  -^-   f  (4.4) 

m 

where  m   is    the   number  of   memory  cycles    required   to   execute    the   average 
instruction   in   the   computer   under   consideration,    t      is   the   computer's 
memory   cycle   time,    and   t      and   f    are   as   defined   above.      Obviously,    mt 
may   be   replaced  by  the   average    instruction  time   of  the   computer,    if   that 
is   known   directly. 

If    s    is    the  time   within  q   during  which  the   user's    program   is    actually 
executed,    then    (q    -    s    +   T  )    is   the  total  overhead   time   within   a   quantum. 
Fe    is  defined   to    be   the   ratio   of    relocation  overhead   to   this    total   over- 
head.     Then 

T0    +   TQ 
a  e 

F      =  — —  (4.5) 

e  q    -    s    +   T 

1  a 

With  some  relocation  techniques,  T   is  zero  or  is  small  enough  to  be 
negligible.   In  this  situation,  the  above  expression  becomes 
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F_  =  (4.6) 

e     q  -  s 

Any  or  all  of  these  expressions  (4.1) -(4. 6)  may  be  usefully  evalu- 
ated when  comparing  different  methods  of  program  relocation.   Some  will 
be  discussed  further  and  used  in  examples  in  the  remainder  of  this 
section. 

Consequences  of  "No  Relocation" 

Suppose  that  the  address/location  map  consists  of  a  one-for-one 
translation  of  addresses  into  physical  locations;  that  is,  the  addresses 
are  always  the  same  as  the  locations,  and  a  "no  relocation"  (within  main 
memory)  situation  exists.   Each  program  is  assumed  to  have  full  use,  out- 
side the  resident  portion  of  the  operating  system,  of  the  possible 
addresses  in  the  computer.   In  such  a  case,  dumping  of  information  from 
main  memory  is  often  required  when,  during  processing,  one  program  is 
interrupted  and  another  started  or  resumed.   This  is  so  because: 

a)  the  new  program  may  require  more  locations  than  are  left 
free  by  program (s)  now  in  main  memory;  or 

b)  even  if  the  required  quantity  of  locations  is  available, 
there  may  be  duplication  in  the  addresses  (  =  locations  here)  used.   (See 
Figure  2.) 

In  special  cases,  where  the  total  naming  requirements  of  all  pro- 
cesses are  less  than  the  number  of  addresses  available,  this  frequent 
dumping  of  information  may  be  avoided.   Then,  the  different  programs  may 
be  allocated  to  separate  portions  of  memory,  and  relocation  is  never 
necessary.   This  is  the  situation  with  the  previously  mentioned  Naval 
and  Marine  tactical  data  systems.   This  is  not  generally  the  case,  how- 
ever, due  to  changing  total  demand,  in  the  time-shared  computing  dis- 
cussed here. 
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Figure  2.   a)  No  relocation  within  main  memory 
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Figure  3.   Use  of  a  relocating  register 
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"No  relocation"  is  sometimes  known  as  "job  swapping",  although  such 
terminology  is  rather  loose.   Often  only  that  portion  of  previous  job(s) 
necessary  to  provide  sufficient  space  for  the  incoming  program  is 
swapped. 

The  new  program,  when  the  exchange  is  complete,  is  normally  "run 
from  zero",  i.e.  started  at  some  fixed  location  outside  the  resident  por- 
tion of  the  operating  system.   In  such  event,  the  only  main  memory  pro- 
tection required  is  to  ensure  that  the  program  does  not  access  above  its 
upper  bound.   Perhaps  the  fastest  way  to  implement  this  would  be  by 
adding  an  additional  hardware  register.   During  program  exchange,  this 
register  would  be  loaded  with  the  new  process'  upper  bound.   Its  wiring 
would  be  such  that  during  the  running  of  a  user  program,  each  memory 
access  would  produce  an  automatic  "compare"  with  the  register's  contents; 
the  occurrence  of  an  access  violation  would  result  in  a  "no  operation" 
on  the  access  and  a  trap  to  a  designated  location.   The  isolation  pro- 
vided here  between  programs  is  complete,  with  neither  reading  nor  writing 
outside  one's  own  process  permitted. 

To  establish  the  timing  of  "no  relocation",  it  is  first  noted  that 
since  there  is  no  address  mapping, 

t   =  T  =  0 
a    a 

Thus  Tr  =  T  (4.7) 

It    is    useful   to   divide   T      into   two    parts.      The   first,    called   I,    is   de- 
fined  as   the    initialization  or    set-up   time   required   before   the  movement 
out   of,    or    into,   main  memory  of    a   user   program    is    actually    initiated. 
This   time   is   used   to    perform  tasks    such  as    searching  memory  tables    for 
a   user   program's    length,    storage    location,    first   word   address,    and  other 
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quantities  necessary  to  define  the  input/output  operation.   The  second 
part  of  Te  is  the  time  taken  for  the  actual  input/output  transfer,  called 
here  Tem.   Thus,  in  general, 

TQ  =  I  +  TQm 
e        em 

For   a    "complete"  relocation,    i.e.    one   input    and  output  movement,    as   during 

a  quantum, 

T     =    2(1    +   T      )  (4.8) 

e  em 

The  initialization  time  varies,  of  course,  with  such  factors  as 
the  computer's  instruction  features  and  speed,  and  the  number  of  system 
users.   It  is  also  dependent  upon  the  exact  "no  relocation"  technique 
employed.   It  will  be  smallest  for  the  simplest  application,  complete 
job  swapping. 

The  actual  input /output  time,  T   ,  depends  upon  the  characteristics 

of  the  computer  under  consideration.   It  is  most  useful  to  consider  as 

Tem  only  that  input/output  time  not  overlapped  with  other  processor 

functions.   Then 

T   =0 
em 

if  the  transfer  is  made  on  a  path  completely  independent  of  processor 

act  ion, 

T   =  nt  L 
em     m 

where  n  is  the  number  of  non-overlapped  memory  cycles  per  word  or  byte 
transferred,  and  L  is  the  number  of  words  or  bytes  being  transferred, 
if  the  transfer  occurs  on  a  cycle-stealing  channel,  and 

T   --ir 

em    w 
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where  W  is  the  gross  transfer  rate  of  the  secondary  storage  device  used, 
if  the  transfer  halts  all  processor  action  while  it  is  taking  place. 
In  relative  measurements  with  "no  relocation", 

F   =  0 
a 

because  t   is  zero,  and 

2(1  +  T   ) 
v     enr 

K   =  

e      q  -  s 

per  quantum. 

Despite  the  large  amount  of  program  movement  into  and  out  of  main 
memory,  this  method  has  proven  to  be  quite  usable  in  practical  systems. 
It  is  the  technique  employed  by  the  General  Electric  Company's  commercial 
time-sharing  system.   Further,  it  was  used  for  several  years  at  Project 
MAC  of  the  Massachusetts  Institute  of  Technology. 

Use  of  a  Relocating  Register 
An  improvement  in  a  technical  sense  over  "no  relocation"  is  use  of 
a  relocating  register,  which  permits  translation  of  a  contiguous  set  of 
addresses  in  name  space  to  any  contiguous  set  of  physical  memory  loca- 
tions.  During  the  exchange  of  programs  into  and  out  of  main  memory,  the 
new  program  is  stored  in  any  convenient  set  of  locations.   Thus  less 
dumping  is  required  than  with  "no  relocation",  where  the  new  program  is 
always  loaded  starting  at  the  same  location.   The  mapping  is  effected 
during  program  execution  by  adding  the  proper  constant,  stored  in  the 
relocating  register,  to  each  accessed  program  address.   Thus  occupation 
of  a  different  part  of  location  space  during  processing  is  permitted  with 

Saltzer,  J.H. ,  Compatible  Time-Sharing  System  Notes  (Project  MAC, 
Massachusetts  Institute  of  Technology,  1965),  pp.  31-36. 
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no  changes  in  the  addresses  actually  contained  in  a  program.   Again,  the 
fastest  implementation  would  be  to  provide  the  register  and  addition  in 
hardware.   Some  foresight  is  needed  in  the  design  of  the  computer  word, 
however,  for  a  means  must  be  provided  so  that  the  register  does  not 
operate  on  program  instructions  which  do  not  reference  a  memory  address. 
(Figure  3.) 

This  method  can  be  considered  to  be  an  extension  of  the  relocating 
loader  in  non-mult iprogrammed  batch-processing.   There,  address  trans- 
lation occurs  only  on  loading;  there  is  no  provision  for  relocation 
during  execution.   By  contrast,  a  relocating  register  effects  repeated 
translations  during  execution. 

Memory  protection  may  be  provided  here  by  using  "bounds  registers". 
Such  registers  would  contain,  for  the  running  program  at  any  moment, 
the  current  upper  and  lower  physical  memory  locations.   Any  attempt  by 
the  program  to  access  outside  these  bounds,  or  to  alter  the  bounds  regis- 
ters, should  result  in  a  protection  trap.   Again,  the  isolation  between 
different  programs  will  be  complete. 

When  a  relocating  register  is  used,  in  contrast  to  "no  relocation", 
t„  ?   0.   Therefore,  in  general,  F   ^  0  and  T  5*  0.   That  is,  a  finite 

C*  El  €1 

address   mapping  time    is   required,    and   program   execution  time   is   thereby 
increased.      A  reasonable   range   to   be  expected   for   a   hardware-implemented 
ta    is   from  20   to    200   nanoseconds.      This    is   the   time   required   to   sense 
that    relocation   is    needed   and   to    add   the   contents   of   the   relocating 
register  to   the   program-contained   memory   address.      The   effect   upon   pro- 
gram execution  time    is,    by    (4.4),    directly   proportional   to   t      and    in- 
versely   so   to  t    .      Thus   the   effect   of   address  mapping  time    is   greater   in 
a   faster   computer.      For   an   example,    choose,    as    reasonable  values, 
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t      =    100    nanoseconds 
a 


m   =    3 


«■! 


Then,    if   t      is   three  microseconds, 


ta 

F      =   —  f  (4.4) 

a  mt 

m 


0.1        2 


3(3)      3 

=   0.0074  or   about   0.7% 

But    if  t      is   one  microsecond, 
m 

F      =   0.022   or   2.2% 
a 

which  is,  of  course,  three  times  as  great.   Plotted  as  Figure  4  are  the 
variations  of  Fa  with  t   and  with  tm,  as  given  by  (4.4).   The  values  of 
m  and  f  are  chosen  as  above.   It  can  be  seen  that  for  the  ranges  shown 
of  ta  and  tm,  the  maximum  F   is  about  0.05  or  5%.   Such  an  increase  in 
execution  time  may  or  may  not  be  of  consequence  in  a  particular  computer 
system. 

If  some  assumptions  are  made,  it  is  possible  to  make  a  direct  com- 
parison between  the  "no  relocation"  and  relocating  register  methods  in 
terms  of  the  average  time  required.   The  independent  variable  will  be  L, 
the  average  length  of  program  being  relocated.   For  "no  relocation",  it 
is  recalled  that  (with  subscripts  now  added,  for  clarity) 


<Vhr  ■  (Vnr  (*-9> 


in   general,    and 
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Figure  4.      Increase   in  program  execution  time  due  to  address  mapping 
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per  quantum.   It  is  assumed  now  that  the  initialization  required  with 
use  of  a  relocating  register  will  take  three  times  as  long  as  that  with 
"no  relocation".   That  is, 

This  relation  seems  reasonable,  considering  the  greatly  expanded  amount 
of  main  memory  management  which  is  required  when  a  relocating  register 
is  used.   It  is  also  assumed  that 

(T  )_  =  nt  L  (4.12) 

em  NR     m 

That    is,    the  secondary   storage   device  to   be   used   in  relocation   is   con- 
nected to   a  cycle-stealing   input/output   channel   which  permits  partial 
overlap  of   program  exchange  time  with  other   processing.      This   particular 
assumption   is  not   critical;    a  non-overlapped   channel   could   just   as   well 
have  been  used.      (T em)nn  will   normally  be    less   than    (Tem)NR,    because   use 
of   the   relocating  register   allows  more  than  one  complete  program  to   be 
in  main  memory   at    one   time.      In   fact, 

<*em>RR   ■  M    (WNR  <^13> 

where  —  is  the  fractional  portion  of  main  memory  occupied  by  the  average 
M 

user  program.   The  meaning  of  this  expression  is  that  if,  for  example, 
the  average  user  requires  one-eighth  of  main  memory,  then  program  move- 
ment time  with  use  of  a  relocating  register  will  be,  in  sum,  one-eighth 
that  taken  with  "no  relocation".   This  is  so  because,  on  the  average, 
eight  user  programs  can  reside  in  main  memory  at  one  time  when  a  reloca- 
ting register  is  employed,  and  when  control  transfers  from  one  of  these 
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programs   to   another,   T        =  0.      Now,    by    (4.2) 

<Vrr  =    <Ta>RR  +    (Te>RR 
and,    per   quantum,    using    (4.8) 

<Tr>RR  "    <Ta>RR    +    2[lRR    +    <Tem>RR]  (*-!*> 

Substituting    (4.3),    (4.11),    (4.12),    and    (4.13),    and  using   average  values, 
(4.14)    becomes 

_  ntmL2 

(VRR   =   taLf    +    6INR+    2—T-  (4'15) 

Substituting    (4.12)    into    (4.10),    and   again  using   average  values, 

<VnR   =    2fNR   +   2ntmr  ^'16> 

These  expressions,    (4.15)    and    (4.16),    represent    in  comparable   terms   the 
average   times    for  program  relocation  with   a  relocating  register   and  with 
"no  relocation".      To   demonstrate  their  different   variation  with  L,    average 
program   length,    the  following  values    are  chosen: 

t      =    100    nanoseconds 

*-! 

I._  =  500  microseconds 
NR 

n  =  2 

t   =2.5  microseconds 
m 

M  =  8000  words 
These  values  are  reasonable  for  a  smaller,  medium  speed  system.   The 
results  are: 

(VrR  =  6.7(10_5)L  +  3  +  1.25(10~6)L2 

(T  ) =  1  +  10~2  L    (milliseconds) 

r  NR 
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These  two   expressions    are   plotted   as   Figure   5,    for  0<L2S8000.      The   super- 
iority of  the  relocating  register  method    is   clearly   shown   for  all   L 
except: 

a)  L<200,   where   the  greater   initialization  required   with  the 
relocating   register   is   the  dominant  effect; 

b)  L  at  7800,    i.e.    L  close  to  M,    where,    even  with  the   use  of   a 
relocating  register,    an   input/output   transfer   is   necessary  upon  almost 
all   program  exchanges. 

The  disadvantage  of   the   relocating  register  method  of   program  relo- 
cation results   from  the  fact   that   a  contiguous   set   of   addresses    is 
always  mapped   into   a  contiguous    set  of  physical   locations.      This   tends 
to   create  overhead  during  program  exchange  when  suspended  programs    in 
main  memory  must   be  moved  up  or  down  solely  to   provide   a   sufficiently 
long   set  of   free   locations   for  an   incoming  process. 

Nevertheless,   the   relocating  register  has   also   proven  to  be   a   feasi- 

Q 

ble  technique.   It  has  been  successfully  tested  at  Project  MAC.  Its 

relative  simplicity  of  implementation  is  an  attractive  feature.  Where 

program  exchange  is  a  frequent  occurrence,  however,  this  method  appears 
to  contribute  a  significant  amount  of  system  overhead. 

Blocks  and  Pages 
A  method  of  avoiding  the  contiguity  problem  is  to  divide  main  memory 
into  increments,  called  "blocks",  and  to  divide  programs  into  "pages". 
All  blocks  and  pages  are  of  the  same  fixed  length.   Pages  may  also  be 
considered  to  be  divided  further  into  "lines",  a  line  being  actually  a 
word  or  a  byte.   When  translating  an  address,  the  address/location  map 


Corbato,  F. J. ,  System  Requirements  for  Multiple-Access,  Time- 
Shared  Computers  (Project  MAC,  Massachusetts  Institute  of  Technology, 
1964),  pp.  6-7. 
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must  relate  the  referenced  page  to  its  current  physical  block  in  memory. 
There  is  no  requirement  for  contiguous  pages  in  a  program  to  be  located 
in  contiguous  memory  blocks.   Also,  there  is  no  need  for  address  space 
to  be  of  the  same  size  as  location  space  in  main  memory;  the  former  may 
well  be  larger,  as  long  as  the  operating  system  knows  which  pages  are 
in  main  memory  blocks,  and  which  are  in  other  storage,  at  any  time. 

An  added  advantage  of  using  blocks  and  pages  is  that  it  now  becomes 
convenient  to  call  into  main  memory  only  those  parts  -  pages  -  of  a 
program  which  are  currently  active.   Of  course,  an  algorithm  is  needed 
to  judge  activity  and  to  decide  which  pages  to  bring  in  during  program 
exchange.   Thus  relocation  overhead  will  be  further  reduced,  beyond  the 
reduction  offered  by  the  removal  of  the  contiguity  requirement.   It  may 
also  be  observed  that  the  paging  of  a  process  is  not  a  matter  of  concern 
to  the  applications  programmer.   Pages  are  fixed-length  subdivisions 
which  may  occur  at  arbitrary  points  in  a  program;  they  are  not  sub- 
routines.  Paging  is  effected  in  the  operating  system  and  is  invisible 
to  the  general  system  user. 

One  straight-forward  implementation  of  blocks  and  pages  creates  a 
table  for  each  active  program  in  the  operating  system.   This  table  asso- 
ciates each  page  of  a  user  program  with  its  current  physical  block  or 
other  location  in  memory.   The  look-up  is  made  on  the  page  number,  ob- 
tained from  the  referenced  address  in  the  program,  with  the  result  being 
the  location.   There  is  no  translation  of  the  line,  which  is  assumed  to 
occupy  the  same  relative  position  in  both  page  and  block.   (See  Figures 
6  and  7  .  ) 

Memory  protection  may  be  provided  by  the  association  of  one  or  more 
bits  with  each  block  in  the  block-page  table;  these  bits  indicate  the 
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type  of   access  which  the  program  is   to  have   to   this   block.       If  more 
than  one  bit   is   allocated,    several   forms  of   protection,    such   as    "read 
only",    "write  only",    and   "no   access",    can  be   identified. 

As    implemented    above,    the  blocks   and   pages   method   of   relocation 
imposes    a   time   penalty   for    its    use.       Extra  memory   cycles    are   required 
during  processing  to  make  the  table   look-ups   for  the  address   transla- 
tions.     This   is   a  serious  disadvantage,    and  to   reduce   it,    current   com- 
puters  employing  blocks    and  pages   perform  the  mapping  with  hardware.      The 
SDS  940,    for  example,    provides   two   extra  registers  which  contain  block 
locations   applicable  to  the  program   in  execution.      The  wiring   is   such   as 
to   replace   the   upper,   or  page-indicating,    bits   of   a  program-referenced 
address   with  the  appropriate  block   location  before  the   fetch,    store,   or 
branch  specified  takes    place.      Two    larger   computers,   the    IBM  System/360, 
Model   67    and  the   GE   645,    incorporate   an   associative  memory  element  with- 
in the   central  processor.      In  this   element   are   stored,    for  the  running 
program,    the   page-block  combinations  of   a  number  of  high-use   pages    (per- 
haps these  would   be   the  most   recently  referenced  ones).      When  an  address 
is   referenced,    a  fast   parallel   search  of   the  element    is  made;    if   the  page 
is   present  ,    its   block    is   then   immediately  known. 

As  with  the   relocating  register  method,    t      ^  0  when  blocks   and   pages 
are  used.      However,    the   possible  values   of   ta  now  vary  over   a  wider 
range.      If   the   address   mapping   is   performed   in  hardware,    ta   will   be  of 
the  same  magnitudes    as   for  the  relocating  register,    and  Figure  4.      ap- 
plies.     However,    if   programming    is    used   to   translate    addresses,    t      may 
be  many   times   tm.      This,    of   course,    would    lead   to  much  higher  values   of 
Ta   and   Fa.      This  matter   will   be   pursued   further    in    Section   5   of  this 
thesis,    where   an   example    implementation  of   program  relocation  using  com- 
bined  hardware-software   address   mapping    is    presented. 
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Generally   speaking,   the   blocks   and  pages  method   has   been   incorpora- 
ted   in  newer   computers,    in  which  there    are  usually   available    input/output 
channels  which  operate   independently,   during  actual   transfer,    of   proces- 
sor  functioning.      In  this   case 


Tern   =   0 


provided  only  that  there  exist  other  tasks  for  the  processor  to  perform 
while  the  transfer  is  taking  place.   The  only  contribution  to  program 
exchange  time,  then,  is  the  required  initialization  time.   Per  program 
page  used,  this  time  may  be  called  ID.   If  k  is  the  number  of  pages 
needed  during  the  measurement  interval,  then 

(Vbp  =  kIP  W-17) 

Recalling   (4.2)   and    (4.3),    the  total   relocation  time   becomes 

(Tr>BP    =   fcaLf    +   kIp  (4'18) 

This  quantity  may  be   compared  to   the  times    required  with  use  of   "no 
relocation"    (4.8)    and   with  use  of   a  relocating  register   (4.14)   over  the 
same  measurement    interval.      L,    in    (4.18),    is  to   be   interpreted   as   the 
length  of   program  executed  over   the  interval.      While,    in  general,    it 
will  be   related  to   k,    the   number  of  pages   used,    these  two  quantities  may 
not   be   strictly  proportional.      L  may  not    increase   as   rapidly  as   k,    be- 
cause the    use  of  more   pages    in  the   measurement   time  suggests   that    fewer 
instructions   are  being  executed    from  each   page.      Figure   8  plots    (4.18) 
against   k   for  the   following  relations   between  L   and  k: 

L— k 

L  -k2/3 

L~k1/3 
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and   for  the  following  values  of  the  parameters: 

ta   =   100   nanoseconds    (i.e.    hardware  mapping) 
f   -    2 

I     =   500  microseconds 
L  =   2000  words   for  k  =   1 

The   results    show  that    for  these  values,    the    initialization  time,    kl    , 
dominates.      The  situation  would  be  reversed,   with  address  mapping  time 
being  more   important,    if   software  mapping  were   employed.      This    is  be- 
cause ta  would  then  be  many  times    larger. 

A  More  Realistic  Computing  Environment 
Three   features  of   a  realistic   computing  environment   have   not  yet 
been  given  proper  emphasis.      These   features    are  very   large   programs, 
variable-size  data  structures,    and   use  of   common   routines.      All   affect 
program  relocation. 

Very   large   programs  occur  fairly  frequently   in  some   systems.      In 
non-mult iprogrammed   computers,   they   are  handled   by  we  11 -developed   over- 
lay and   chaining  techniques.      In  general,    if   the   address    space  of  the 
system  is    large  enough,    unique  names  may  be   assigned  throughout   a  com- 
putation.     Re-naming   is   then  never   necessary.      Otherwise,    some  of  the 
addresses   used    in   later  portions   of   a  process   will   have  to  be  made  the 
same   as   some  used  earlier.      The   operating   system  must   keep   a  record  of 
any  such  correspondence.      In   a  mult iprogrammed   computer   using   relocation, 
this  re-naming  represents    an  extra,    time-consuming  translation  beyond 
that   required   by  the  normal   address/location  map. 

Examples   of  variable-size  data   structures    in  programs    include 
arrays,    lists,    and  pushdown   stacks.      These  occur  frequently,    and   it    is 
difficult   to   know  their  eventual   size,   particularly   in  the   on-line 
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environment   discussed   here.      This    leads   to   a  dilemma.      If,    in  processing, 
too  many  addresses   are  reserved   for  variable-size   structures,    an   inef- 
ficient   use  of   name  space   results.      On  the  other  hand,   if   too    little 
space   is  reserved,    naming   conflicts   will   arise. 

For  both  very   large    programs    and  variable-size   data   structures, 
the   conclusion   from  the   point   of  view  of   naming  requirements    is   that    it 
would  be  desirable   to  have   an  address   space   sufficiently   large   that, 
in  practice,    it   would  never   be  filled. 

A  third   important   feature  of   a  realistic  computing   environment    is 
the  use  of   common  routines.      It  has   been  suggested  that    in   a  time-sharing 
system,    "common  routines"  means   more   than   just    a    library   collection.      An 
important    facet  of  interactive,   on-line  programming  appears   to  be  the 
frequent   exchange  of   information  between  system  users.      In   some  cases, 
this  exchange   has  occurred  between   two   users   active   at   their  terminals, 

one  entering   some  matter   and   then  transmitting   it,   through  the  system, 

o 
to  the  other. 

It   is  desirable,    for  the   sake  of  efficiency,    to   code   as  many  common 
routines   as   possible    in   "re-entrant"  form.       Puch  routines   may,    by  defi- 
nition,  be   entered  by  a   second  program  before   a   first  has   finished   its 
use.      Ideally,    then,   only  one   copy  of   a  re-entrant   routine   need  be   pre- 
sent  within  the   system,    no  matter   how  many   users   might   call    it.      How   is 
reference  to  be  made  by  programs  to   this   single  copy? 

First,    a   separate    address/location  map  may   be   assigned   to   the   common 
matter,    just   as    if   it   were   an   independent   user   program.      This  method   has 
the  advantage  of   permitting  the   common  routine   use   of   the   full   address 

9 
Fano,   R.M. ,    and   F.J.    Corbato,    Time-Sharing  on  Computers    (Scien- 
tific  American,    September   1966),    p.    140. 
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space  of  the  computer.   However,  it  requires  a  change  of  map  whenever 
the  routine  is  called.   More  importantly,  the  transmittal  of  arguments 
through  two  maps  would  be  rather  complex  and  possibly  time-consuming. 
This  disadvantage  would  compound  when  a  number  of  common  routines  are 
referenced. 

Second,  the  common  routine  may  be  assigned  a  portion  of  the  address 
space  of  the  calling  program.   Then  no  change  of  map  is  needed  when  the 
routine  is  called.   If  address  space  is  sufficiently  large,  the  assign- 
ment does  not  impose  a  significant  restriction  on  the  calling  program. 
However,  this  method  will  work  with  a  single  copy  of  the  common  matter 
only  if  1)  it  is  arbitrarily  relocatable,  or  2)  it  always  occupies  the 
same  portion  of  address  space  in  any  using  program.   Arbitrary  relocata- 
bility  would  impose  an  additional  constraint  on  the  coding,  beyond  that 
of  re-entrancy;  it  also  implies  an  extra,  time-consuming  movement.   By 
requiring  the  movement,  it  really  begs  the  question  of  whether  a  single 
copy  is  being  used.   By  contrast,  use  always  of  the  same  portion  of  ad- 
dress space  appears  to  be  a  serious  restriction.   Yet  if  address  space 
is  large  enough,  this  technique  has  the  virtue  of  simplicity;  a  particu- 
lar common  routine  would  always  be  found  at  the  same  program  addresses. 

Segmentation 
Up  to  this  point,  certain  desirable  features  in  the  addressing 
structure  of  a  mult iprogrammed  computer  have  been  noted: 

a)  address  space  should  be  large  enough  that  unique  addresses 
may  be  assigned  throughout  any  practical  computation; 

b)  data  structures  should  be  expandable  without  necessitating 
a  reallocation  of  addresses;  and 

c)  information  common  to  several  programs  should  have  the 
same  addresses  for  all  programs  that  reference  it. 
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It  should  be  stressed  that  the  total  address  space  of  a  system  need  not 
be  physically  implemented  in  main  memory  storage.   In  fact,  considering 
the  large  addressing  capability  of  some  new  computers,  the  cost  of  such 
implementation  would  be  economically  very  prohibitive.   The  GE  645,  an 
extreme  example,  provides  36-bit  addressing,  or  the  capability  of  speci- 
fying over  68  billion,  words. 

Given  a  sufficiently  large  naming  capability,  the  segmentation  of 
program  addresses  provides  a  suitable  way  to  structure  the  addressing 
scheme.   Physically,  the  resulting  program  segments  are  an  ordered  col- 
lection of  computer  words  with  an  associated  segment  name.   The  number 
of  bits  used  for  the  segment  name  is  chosen  to  permit  as  many  segments 
as  may  be  needed  to  distinguish  different  common  routines,  parts  of 
programs,  etc.  The  number  of  bits  then  left  for  word  addresses  within 
the  segment  should  allow  for  the  largest  collection  of  information  that 
is  to  be  addressed  as  one  ordered  sequence.   In  the  GE  645,  18  bits  are 
employed  for  the  segment  name,  leaving  the  same  number  for  word  addresses; 
218  =  262,144.   (See  Figure  9.) 

Segments  are  used  for  the  allocation  of  address  space,  not  physical 
memory.   But  unlike  pages,  they  are  not  invisible  to  the  programmer.   To 
him,  a  segment  is  any  more  or  less  independent  subdivision  of  a  program. 
It  may  consist  entirely  of  instructions,  entirely  of  data,  or  it  may  be 
a  mix  of  both.   Examples  of  likely  segments  are  main  programs,  common 
subroutines,  and  data  arrays. 

The  segment  may  serve  as  the  basis  of  memory  protection.   An  advan- 
tage here  is  that  address  space,  which  is  unchanging  over  the  life  of  a 


McGee,  W.C. ,  On  Dynamic  Program  Relocation  (IBM  Systems  Journal, 
No.  3,  1965)  ,  p.  188. 
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computation,  is  used.   With,  for  example,  m  programs  employing  a  total 
of  n  segments,  an  m  x  n   matrix  might  be  formed,  its  elements  indicating 
the  type  of  access  which  each  program  is  permitted  to  make  to  each 
segment . 

During  execution,  a  correspondence  is  drawn  between  a  program  address 
identified  by  segment  name  and  word  number,  and  a  physical  memory  loca- 
tion.  This  might  be  done  directly,  but  a  more  flexible  technique  -  in 
that  it  limits  exchange  overhead  while  not  penalizing  long  segments  - 
calls  for  dividing  the  segment  into  pages,  and  main  memory  into  blocks. 
Thus  segmentation  may  be  considered  to  add  a  second  level  of  translation 
to  the  blocks  and  pages  method  of  program  relocation. 

Because  of  the  presence  of  this  second  level,  mapping  times  with 
segmentation  will  normally  be  higher  than  those  which  occur  with  use  of 
other  relocation  methods.   Also,  because  of  the  greater  complexity  of 
the  tables  to  be  searched,  initialization  times  will  tend  to  be  higher. 
In  general,  then,  segmentation  will  not  be  the  fastest  method  of  reloca- 
tion on  a  single-transfer  basis.   Justification  for  its  use  is  based, 
instead,  upon  its  overall  efficiency  in  address  space  allocation,  which 
should,  in  fact,  reduce  the  total  relocation  requirement. 

The  general  nature  of  an  address/location  translation  using  paged 
segments  is  shown  in  Figure  10.   An  example  implementation  of  program 
relocation,  employing  segmentation,  is  discussed  in  the  next  section. 
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5.      Implementation  -  A  Very  Large  Time-Sharing  Computer. 

Why  Considered 

This  section  presents  and  evaluates  the  method  of  program  relocation 
implemented  in  a  particular  very  large,  time-sharing  computer.   The  pur- 
pose of  doing  so  is  to  obtain  ideas  which  may  be  applicable  to  the  Digital 
Control  Laboratory  system.   Although  the  D.C.L.  computer  is  not  "very 
large",  it  should  prove  worthwhile  to  consider  such  a  system  where  the 
problems  of  relocation  are  fully  met.   Possibly,  some  of  its  solutions 
may  then  be  scaled  to  fit  the  D.C.L. 's  requirements. 

The  machine  chosen  for  this  investigation  is  the  IBM  System/360, 
Model  67.  The  reasons  for  selecting  this  particular  computer  are  the 
following: 

a)  It  is  further  developed  than  the  only  other  computer  of 
like  size  and  purpose,  the  GE  645; 

b)  the  pre-eminence  of  IBM  as  a  manufacturer  of  digital  com- 
puters is  based  in  part  upon  technical  excellence,  and  the  Model  67's 
relocation  technique  may  reflect  this  fact;  and 

c)  a  Model  67  is  being  installed  in  the  central  computer 
facility  of  the  Naval  Postgraduate  School,  which  causes  a  natural  in- 
crease in  interest  in  its  design. 

A  note  of  caution  is  in  order.   The  first  System/360,  Model  67  was 
delivered  in  January,  1967.   Completion  of  the  operating  system  for 
time-sharing,  however,  will  not  occur  until  1968.    Thus,  although  the 
relocation  hardware  has  been  delivered,  the  systems  programming  necessary 
to  employ  it  in  a  functioning  system  has  neither  been  fully  developed 

As  announced  in  January,  1967.   This  is  a  slippage  from  an  earlier 
stated  date  of  August,  1967. 
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nor,    more   importantly,    user-tested.      The  discussion  which  follows  must, 

12 
therefore,   be  tentative. 

Address   Translation 

A  Model    67    equipped    for   standard,    24-bit   addressing  offers    an   address 
space,    or    "virtual  memory"    in   IBM's   terms,    of    16,777,216   eight -bit   bytes. 
(All    specifications   by    IBM  of    addressing  or    storage   capabilities    are  made 
in  terms  of  these  bytes.)      The  four  high-order   bits  of   a  program-con- 
tained,   or    "logical",    address    are    interpreted   as    a   segment    number;    the 
next    eight    form  the  page   number,    and   the  final    twelve  give   the    line,    or 
byte.       (Figure    11.)     Thus    address    space    is   divided    into 
16   segments,    each  of   which  contains   up  to 
256   pages   of 

4,096  bytes  each. 
It    is  this   4,096-byte   page  which   is   the  fundamental  quantity  moved  or 
translated   during  program  relocation. 

32-bit    addressing   is   available   as   an  option.      This   provides    a  vir- 
tual memory  of  over  four   billion  bytes.      The   logical   address    is   broken 
into   three    sections  of  twelve,    eight,    and   twelve   bits    specifying  the   seg- 
ment,   page,    and   line,    respectively.      Thus   there   are  4,096   segments 
addressable   with   this   option,    while  the   number  of    pages    in  each,    and   the 
length  of   a  page   are  the   same   as    in  24-bit   addressing. 

It    is    intended   to   use   strict    "demand   paging"   in  the  Model    67.      That 
is,   when    a   program   is   to   begin   or   to   continue   executing,    only    its    current 
page    is    necessarily   brought    into   main  memory.      Further   pages   not    already 

12 

Available  technical  information  on  this  computer  is  limited. 

The  principal  reference  is  System/360  Model  67  Time  Sharing  System  Pre- 
liminary Technical  Summary  (IBM  Form  C20-1647-0,  1966). 
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in  main  memory  will  be  brought  in  only  when  "demanded",  i.e.  referenced. 
This  technique  will,  it  is  expected,  eliminate  overhead  due  to  unneces- 
sary page  movement.   Storage  for  program  pages,  until  they  are  first 
needed,  will  be  on  a  disc;  thereafter,  they  will  be  either  in  core  mem- 
ory, or  as  required  to  free  core  space,  on  a  "swapping"  drum.   Any  over- 
flow from  the  drum  will  be  held  again  on  the  disc.   (Figure  12.)   All 
transfers  between  these  different  levels  of  storage  will  be  completely 
transparent  to  the  user. 

The  entire  address/location  translation  is  implemented  in  hardware 
to  minimize  the  time  required.  There  are  two  special  hardware  features 
which  assist  the  system  in  maintaining  execution  speed.  These  features 
are: 

a)  an  associative  memory  element; 

b)  storage  of  the  instruction  counter  in  relocated  form. 
The  associative  memory  element,  first  mentioned  in  the  preceding 

section  of  this  thesis,  consists  of  eight  registers.   Each  time  a  new 
page  is  referenced  by  a  program,  its  segment  and  page  values,  and  current 
physical  block  location  are  loaded  into  one  of  these  registers.   On  sub- 
sequent program  references  to  virtual  memory,  a  high-speed  parallel 
search  of  the  registers  is  made.   If  the  desired  segment  and  page  number 
are  found,  the  physical  location  information  is  routed  to  replace  that 
which  otherwise  would  have  been  supplied  by  the  segment  and  page  tables. 
With  eight  registers,  the  relocation  of  the  addresses  of  a  program  con- 
taining up  to  32,768  bytes  can  be  performed  entirely  in  the  associative 
memory. 

The  structure  of  each  associative  register  is  shown  in  Figure  13. 
Besides  the  logical  segment  and  page  numbers,  and  physical  block  location, 
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there  are  included  a  "use"  bit,  a  "validity"  bit,  and  four  presently  un- 
used bits.   These  last  provide  a  very  desirable,  built-in  capability  for 
system  evolution.   The  validity  bit  indicates  whether  the  page  named  in 
that  register  is  in  core.   If  zero,  the  page  is  not,  and  any  associative 
comparison  being  made  is  aborted.   In  effect,  this  bit  appears  to  offer 
the  operating  system  a  safety  check,  particularly  valuable  just  after 
program  exchange.   Upon  an  exchange,  all  validity  bits  are  automatically 
set  to  zero;  this  allows  the  operating  system  to  load  only  the  associative 
register  initially  needed  for  the  new  program,  without  being  concerned 
that  a  program  reference  to  a  new  page  might  result,  through  combinations 
still  left  in  other  associative  registers,  in  inadvertent  access  to 
another  user's  program.   Each  use  bit,  also  set  to  zero  during  program 
exchange,  becomes  one  the  first  time  its  register  is  used  during  reloca- 
tion.  When  the  eighth  use  bit  is  set  to  one,  all  of  these  bits  are  re- 
turned to  zero,  and  the  cycle  repeats.   The  operating  system  should 
attempt  to  find  a  register  with  a  zero  use  bit  when  determining  where 
to  load  the  page/block  information  for  a  newly  referenced  program  page. 

Storage  of  the  instruction  counter  in  relocated  form  adds  to  pro- 
gram execution  speed  by  obviating  the  need  for  instruction  address 
translation  until  a  branch  occurs  or  a  page  boundary  is  crossed.   The 
storage  is  accomplished  in  an  extra  register  used  solely  for  this  purpose. 

How  is  a  complete  24-bit  address  translation  performed?   (Figure  14.) 
First,  relocation  must  be  properly  specified  in  the  Program  Status  Word, 
the  64  bits  of  control  information  associated  with  each  active  program 
in  the  system.   Bits  four  and  five  of  the  Word  indicate  the  relocation 
mode  as  follows: 
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Bit   4  Bit    5  Mode 

0  0  No   relocation,    24-bit   addressing 

0  1  Relocation,    24-bit    addressing 

1  0  Invalid   combination  resulting 

in   a   trap 

1  1  Relocation,    32-bit   addressing 

(invalid  combination   if    32 -bit 
addressing  not    installed) 

Then,    when   a   program  references    a   logical    address,    a   match    is    first 
attempted   between   bits    0-11   of   the    logical    address    (the    segment    and    page 
numbers)    and   bits   0-11   of  each   associative   register   having   bit   25    (the 
validity  bit)  set   to   one.      If   a  match   is   found,    bits    12-23  of  the   asso- 
ciative  register   become   bits   0-11   of   the    actual   core   storage   address. 
Bit    24    (the  use   bit)    is    set    to   one    if   not    already   at   that   value. 

If,    however,    there   is   no  match,    the   segment   and  page   tables   stored 
in  core  memory  must    be   used.      All    additions   described    in   the   look-ups   on 
these  tables   are   permanently  wired   for  speed,    reducing   the   reference  time 
for  each  table  to  one  memory  cycle.      There    is   first   a  Table  Register, 
whose  bits   8-31  contain  the  origin  of  the   segment   table   for  the   running 
program.      To   this    origin  are    added   bits   0-3   (the    segment   number)    of   the 
logical    address.      For  this    addition,    these   bits    are    aligned   with   bits 
26-29  of  the   segment   table  origin  since   the  entry   being  found    is    four 
bytes   long.      This   obtains,    held   in  bits    8-31  of  the   result,    the  origin 
of   the   page   table   for   the    indicated   segment.      Added   then  to   this   origin 
are  bits   4-11    (the   page)    of   the    logical   address,    aligned   with   bits    23-30. 
This   finds    a  two-byte   entry    in  the  page   table  consisting  of   a  physical 
block   location  portion    (bits   0-11)    and   control   bits    (12-15).      Bit    12    is 
zero    if   the   referenced    page    is    actually    in  core;    if    so,    bits   0-11    are 
used    as   the   same   bits   of    the   physical    address.       If    bit    12    is  one,    the 
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operating   system  may  be   called    in  order  to   set   up  an   input  operation  to 
bring  the  desired   page    into   core.      This    last    action  would  presumably 
continue    independently  through   an   input -output    processor  while   another 
program  takes  over  the  central   processor.      The  other  control  bits    (13-15) 
are  reserved   for  future  use.      The   translation   is   completed   with  bits 
12-23  of   the    logical    address    forming,    unchanged,    the   same   bits   of   the 
physical   address. 

Address  translation  with  32-bit  addressing  is  different  only  in  that 
the  segment  table  for  each  program  may  be  much  longer,  containing  as  many 
as  4,096,    instead  of   16,    entries. 

Relocation  Timing 
Enough    is   known   about   the  Model    67   to   quantify   the    address   mapping 
portion  of   program  relocation  time.      When  relocation   is   operative,    and 
a  memory  reference  occurs,    the  Model    67' s   clock   is   stopped    for   150  nano- 
seconds  during  the  associative   compare.      If   a  match   is   found,    that   time 
is   the  delay   imposed   by  use  of   address   translation.      That    is,   t      =   150 

EL 

nanoseconds.   If,  however,  the  segment  and  page  tables  must  be  used,  the 
clock  remains  blocked  while  two  accesses  to  the  tables  are  made  and 
while  the  page  entry  found  is  loaded  into  one  of  the  associative  regis- 
ters.  This  action  takes  three  memory  cycles,  or  about  2.1  microseconds. 
Now,  t   =2.25  microseconds.   Obviously,  system  performance  is  greatly 
degraded  when  the  segment  and  page  tables  are  used,  even  if  all  pages 
referenced  are  already  in  core  memory.   How  often  will  use  of  these 
tables  be  necessary  during  execution  of  a  typical  program?  An  IBM 

1  o 

simulation  indicates  that  a  figure  of  5%  will  be  realistic.    That  is, 

13 

'    System/360   Model    67   Time   Sharing   System  Preliminary  Technical 

Summary    (IBM  Form  C20-1647-0,    1966),    p.    56. 
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the  tables  will  be  required  on  5%   of  the  memory  references  made  by  an 
average  program;  for  the  other  95%,  a  match  will  be  found  in  the  asso- 
ciative memory.   Certainly  this  figure  will  vary  with  different  programs 
and  in  different  computing  environments.   Using  it,  however,  the  effect- 
ive, or  weighted  mean,  mapping  time  can  be  computed  to  be 


t   =  0.05(2250)  +  0.95(150)  =  255  nanoseconds 

d 


This  value  is  beyond  the  range  specified  (20-200  nanoseconds)  for  single- 
level  address  translation  schemes.   Further,  it  is  recalled  that 

t. 
mt. 


F  =  —  f  (4.4) 


'm 


For  the  Model  67,  tm  =  750  nanoseconds.   Choosing,  as  suitable  values, 


m  =  3 


**! 

F      -        255 

2 

a           3(750) 

3 

then 

a 

=  0.0756  or  nearly   7.6% 

With  an   F     of  this  magnitude,    increases    in   program  execution  times 
due  to   address   translation  overhead  will   be  noticeable.      From  another 
point   of  view,    consider   a  hypothetical   program  requiring   100   storage 
references,   whose   run  time  on  the  Model    67    in   unrelocated  mode   is    200 
microseconds.      This  might   be   a  typical   short    scientific  calculation, 
quite   active    in  memory   as    it   carries   out    the   programmed   algorithm.       It 
is    assumed  that   the   program    is   entirely    in   core  memory.      With   relocation, 
and    5%  of   the  memory   references   requiring   use  of   the    segment    and    page 
tables,    the   run  time  will    become 

200   +   10o|"o.l5    +   0.05(2.1)1    =   225.5   microseconds 
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This  is  an  increase,  due  to  address  translation  alone,  of  nearly  13%. 
Even  if  the  table  activity  were  zero,  execution  speed  would  be  degraded 
in  this  example  by  7.5%. 

Program  Sectioning 
System  users  must  be  provided  with  programming  aids  which  enable 
them  to  take  advantage  of  the  program  segmentation  implemented  in  the 
hardware.   They  must  be  able  to  conveniently  section  their  programs  into 
logical  units.   There  must  also  be  a  means  of  linking  these  sections, 
including  their  joining  to  those,  such  as  common  routines,  written  by 
others.   In  a  multiprogrammed  computer  where  it  is  intended  that  only  one 
copy  of  a  re-entrant  common  process  be  called  by  all  users,  special  con- 
sideration must  be  given  to  this  linkage. 

Compiler  languages  generally  provide  already  a  means  of  sectioning 
processes.   In  FORTRAN  and  in  ALGOL,  for  example,  program  subdivisions 
are  formed  by  use  of  Subroutine  and  Block  statements,  respectively.   Many 
assemblers  provide  a  similar  capability  through  ORiGin  directives  and 
through  machine  instructions  which  branch.   Examples  of  such  instructions 
include  the  Return  Jump  of  CDC  1604  CODAP  and  the  SDS  900  Series  compu- 
ters ,:  Mark  Place  and  Branch.   The  assembly  language  for  the  System/360, 
Model  67  includes  similar  features;  additionally,  the  programmer's 
general  control  of  sectioning  has  been  expanded,  and  a  means  of  linking 
programs  to  a  re-entrant  common  process  has  been  provided. 

Three  assembler  directives  implement  the  new  sectioning  power  on 
the  Model  67.   These  directives  are: 

CSECT    -   Control  SECTion 

COM     -    COMmon  Control  Section 

PSECT   -   Prototype  Control  SECTion 


A  "control  section"  is  a  block  of  coding  whose  virtual  memory  assignments 
can  be  adjusted,  independently  of  other  coding  (save  for  linkages),  at 
linkage  or  load  time  without  impairing  the  operation  of  the  program. 
Thus  a  control  section  is  a  logical  unit,  or  in  the  sense  of  Section  4 
of  this  thesis,  a  program  segment.   The  CSECT  directive  identifies  the 
beginning  of  a  control  section.   A  tag  may  be  placed  in  the  label  field 
(to  the  left  on  the  coding  sheet)  of  a  CSECT,  thus  naming  the  section. 
All  statements  following  a  CSECT  are  assembled  as  part  of  that  control 
section  until  a  new  CSECT  directive  is  encountered.   The  object  code  for 
each  CSECT  starts  on  a  page  boundary,  and  a  page  table  (without  physical 
location  assignments,  of  course)  is  produced  as  the  section  is  assembled. 

The  COM  directive  identifies  common  coding  blocks  which  may  be  re- 
ferred to  by  more  than  one  independent  assembly  when  the  assemblies  and 
the  common  block  have  been  linked  and  loaded  as  one  overall  program. 
"Blank"  common  sections  may  contain  only  data  placed  there  during  pro- 
gram execution.   Named  common  sections,  however,  may  contain  instruc- 
tions, constants,  or  data,  in  any  combination. 

It  is  the  PSECT  directive  which  provides  for  the  linkage  of  calling 
programs  to  re-entrant  common  routines.   The  chief  problem  here  is  the 
handling  during  execution  of  temporary,  or  "working",  storage  required 
by  the  routine  for  each  program  which  is  concurrently  using  it.   In  the 
Model  67,  this  matter  is  resolved  by  the  setting  up  of  an  individual 
working  area  for  each  calling  program  within  that  program's  own  virtual 
memory.   Re-entrant  routines  in  this  computer  appear  to  have  different 
address  space  assignments  to  different  programs,  although  their  actual 
physical  locations  remain  unchanged.   Thus  when  control  is  transferred  to 
such  a  routine,  the  calling  program  must  specify  an  "address  constant", 
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a  quantity  which  reflects  its  virtual  memory  assignments,  in  order  that 
the  routine  may  obtain  a  working  area  therein  for  this  caller.   The 
prototype  control  section  is  defined  for  re-entrant  process  use  to  handle 
these  address  constants  and  working  storage  assignments. 

Within  a  re-entrant  routine  all  working  storage  and  address  con- 
stant requirements  are  placed  within  a  prototype  control  section.   This 
section  forms  a  special  subdivision  of  the  re-entrant  process.   When  the 
routine  is  called,  a  copy  of  the  contents  of  the  prototype  control  sec- 
tion is  made  and  assigned  to  virtual  memory  locations  within  the  calling 
program.   Thus  a  working  storage  area  and  proper  address  transfers  are 
established  in  and  for  the  calling  program.   All  of  this  is  transparent 
to  the  user;  he  need  not  know  any  of  the  internal  requirements  of  the 
re-entrant  routine  which  he  is  employing. 

Lastly,  one  or  more  operands  may,  quite  usefully,  be  included  with 

a  CSECT,  COM,  or  PSECT  directive  in  order  to  specify  certain  attributes 

of  the  control  section.   These  operands  include: 

PUBLIC     -    indicates  that  the  control  section  contains 
matter  to  be  accessible  to  any  program 

REENTRANT      -        indicates    that    the   section's    coding  may   be 

re-executed   from  any   point    after    interruption 

VARIABLE        -        denotes   that   the   section's    length  may  vary 
during   program  execution 

READONLY        -        indicates    that    the   section   contains    instruc- 
tions  or  data   which   are   never  modified 

Evaluation 
Consideration  of   the   extensive   relocation  hardware    incorporated    in 
the    System/360,   Model    67    leaves    no   doubt    that    its   designers    are   attempt- 
ing  to   make   thorough   provision   for   the   program  movement    and    addressing 
requirements   of   time-shared,    multiprogrammed    computations.      However, 
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without  a  completed  operating  system  and  user  tests,  it  is  difficult  to 
assess  the  relocation  performance  of  this  computer.   The  importance  of 
the  operating  system  as  it  uses,  or  fails  to  use,  the  features  of  the 
hardware  to  produce  an  efficient  total  relocation  method  cannot  be  over- 
emphasized . 

The  address  space  of  at  least  16  million  bytes  (four  million  32-bit 
nominal  words)  appears  sufficiently  large  to  allow  the  Model  67  to  ef- 
fectively use  segmentation.   That  is,  virtual  memory  can  readily  hold 
very  large  programs  together  with  a  full  library  of  common  routines, 
while  having  a  further  allowance  for  variable-size  data  structures-   The 
small  number,  16,  of  segments  provided  in  the  translation  of  standard 
24-bit  addresses  suggests,  however,  that  these  segments,  while  useful 
in  the  hardware  translation  itself,  will  not  serve  as  the  immediate 
logical  subdivisions  of  programs.   For  the  latter,  groups  of  pages  will 
be  more  appropriate  in  available  number  and  length,  as  required  during 
the  assembly  of  control  sections. 

The  size  of  the  individual  program  page,  4,096  bytes,  is  adequate 
to  hold  many  shorter  routines  or  data  areas.   Yet  for  computing  environ- 
ments where  longer  programs  are  the  rule,  this  short  length  may  lead  to 
considerable  page-turning  in  and  out  of  core.   This  is  a  critical  subject, 
for  how  much  of  page  movement  overhead  may  be  really  submerged  by  input/ 
output  independent  of  processing?   May  not  the  central  processor,  in 
this  complex  multi-level  store  system,  have  to  do  a  significant  amount 
of  initialization  and  set-up  before  turning  over  the  operation  to  a 
channel?   If  the  use  of  strict  demand  paging  results  in  excess  overhead 
time,  it  will  be  necessary  to  make  some  modification  to  the  page-turning 
algorithm.   It  might  be  desirable  to  ensure  that  core  contains  several 
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pages,  rather  than  just  one,  of  a  program  about  to  execute,  or  even  to 
limit  a  user  to  programs  occupying  some  reasonable  subset  of  total  core 
and  then  automatically  to  bring  in  his  entire  program  before  he  executes. 

The  SDS  940  time-sharing  system,  for  example,  does  the  latter;  there, 

.  .  14 

programs  are  normally  limited  to  16K  of  a  64K  word  maximum  core.     A 

further  reason  to  limit  users  to  a  subset  of  core  is  to  minimize  the 
length  of  the  segment  and  page  tables  that  must  be  handled  by  the  sys- 
tem.  With  enough  concurrent  users,  the  core  space  occupied  by  these 
tables  may  become  significant;  if,  in  such  case,  some  of  the  tables  are 
swapped  in  and  out  during  program  exchange,  a  further  addition  is  made 
to  system  overhead. 

A  definite  liability  in  the  Model  67  is  that  part  of  relocation  over- 
head due  to  address  translation.   Extra  time  is  always  required,  even 
when  the  associative  memory  alone  is  used.   Some  smaller  time-sharing 

systems   with    single-level   mapping    (no    segments,    only   pages)    such   as,    again, 

15 
the   SDS  940,        are  able  to   perform   address   translation  with   no    increase 

in   execution   time. 

Finally,    from  the   point   of   view  of   the  Digital   Control    Laboratory's 
requirements,    the  major    idea  obtained    from   study  of    the   IBM   System/360, 
Model    67    is    a   realization   of   the  complexity   of    segmentation.      This   con- 
cept,  whose   complexity    is    apparent    in   both  hardware    and    supporting   sys- 
tems  programming,    is    far  more   difficult   to    implement   than   to   describe. 


14SDS    940   Computer   Reference   Manual    (SDS   Publication   900640A, 


August    1966),    p.    8. 
15Ibid. 
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6.      Implementation   -   Digital  Control   Laboratory. 

The  Computing  Environment 

The  D,igital   Control    Laboratory,    a   facility   of   the  Department   of 
Electrical   Engineering,    serves   as   a  tool  of   research   and   instruction 
for   faculty   and    students   of   the   Naval    Postgraduate   School.      Many  pro- 
jects,  most   of   which   are    associated   with   coursework  or   theses,    are 
accomplished   here  during   the   academic   terms.      For   example,    all    students 
in   the  beginning  course    in  digital    computers   offered   by   the  Department 
of   Electrical  Engineering  currently   perform  at    least   one-half    their 
laboratory   work   in   the  D.C.L.      While  there   are    some   extensive   projects, 
most    are  quite    small,    being  measured,    in  terms   of    digital    computer   pro- 
gram  lengths,    in   tens    and  hundreds   of    instructions. 

A  particular   competence  has   been   developed    in   the   use   of    cathode- 
ray   tube   displays   and    in   hybrid   computation.       In   fact,    the   Laboratory 
presently   contains   the   only   display   and   hybrid   equipment    available   at 
the  Naval   Postgraduate    School.      Much   advanced    course    and   thesis   work 
has    been   performed   with   the   aid  of    this   equipment,    in   applications   such 
as    tactical   warfare    simulation   and    sampled-data   control   systems.      Con- 
sidering the  value  of   this    work  to    the   Department   of   the   Navy,    its 
continuance   is    important   and  even  necessary.      Participating   officer   stu- 
dents  gain   experience  which  may   be    invaluable   to    them   in   future    assign- 
ments. 

The  new  computer   system  which    is   currently   being  obtained   for   the 
D.C.L.  will    significantly   expand   and   enhance    its    capabilities.      The 
principal    item  ordered    is    a   Scientific    Data    Systems    Model    9  30   digital 
computer,   which    is    a    1 . 75-microsecond  memory   cycle,    24-bit   word   machine. 
Two   keyboard   cathode-ray   tube   displays,    each  capable   of   operation   in 
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character,  vector,  and  point  modes,  are  also  included.   These  displays 
do  not  contain  any  internal  memory;  because  of  this,  all  information 
being  presented  on  them  will  have  to  be  stored  in  the  memory  of  the  SDS 
930.   There  will  be  a  new  analog  computer  and  the  necessary  analog- 
digital  converters  for  connection  to  the  digital  computer.   For  the  first 
time  in  the  D.C.L.,  a  nearly  full  range  of  standard  digital  computer 
peripheral  equipments  will  be  available;  these  are  a  card  reader,  line 
printer,  paper  tape  reader  and  punch,  and  two  magnetic  tapes. 

Development  of  a  limited,  internal  time-sharing  system  is  envi- 
sioned as  the  best  means  to  make  full  use  of  all  this  equipment.   The 
concurrent  operations  thus  provided  should  reduce  problem-solution  time 
and,  it  is  hoped,  allow  a  closer  interface  between  user  and  machine. 
Because  of  their  greater  potential  in  these  respects,  the  two  displays 
will  receive  preference  over  the  standard  peripherals  in  service  re- 
ceived from  the  SDS  930.   Small-scale  batch-processing  using  the  card 
reader,  line  printer,  and  paper  tape  system  in  the  background  is  planned, 
however.   In  fact,  two  priorities  are  envisioned  for  this  background 
computing.   The  higher  would  be  assigned  to  normal,  short  programs;  the 
other  would  be  for  the  infrequent  long  program.   It  is  hoped  to  later 
include  hybrid  computation  within  the  capabilities  of  the  time-sharing 
system.   When  this  is  done,  however,  the  highest  scheduling  priority  in 
the  SDS  930  will  probably  have  to  be  accorded  to  its  hybrid  program, 
because  of  the  latter' s  relatively  rigid  requirements  for  execution  at 
fixed  time  intervals. 

The  overall  goal  of  the  time-sharing  system  proposed  for  the  Digital 
Control  Laboratory  may  be  stated  as  follows:   improvement  in  service  to 
all,  but  with  preference  to  display  and  hybrid  computing. 
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Pertinent   Features  of  the  D.C.L.  Digital   Computer 
The   SDS   930   computer  being  delivered  to   the    Digital   Control  Labora- 
tory16 has  many   features   which  will    influence   the  choice  of   relocation 
method  to  be   used   in  the   proposed  time-sharing  system.      In  general,    this 
computer  may  be  characterized   as   a   later   second-generation  machine,    not 
designed  for  multiprogramming  or  time-sharing. 

There  will   be   16,384  words  of   core  memory.      This   amount   at   first 
seems  most    adequate,    considering  the   probable  small   size  of  most   user 
programs.      However,    in  the   time-sharing   system,    all   of  this   memory  will 
not   be   available  for  user  programs.      Space  must   be   reserved  for  the  resi- 
dent  portion  of   the  operating  system  and   for  a  buffer   for  each  of  the 
displays.      The  relocation  method  used  will  probably  affect  the  size  of 
the  operating   system,    including  the  core  resident. 

Secondary   storage   is   provided    as   a   131,072-word   rotating  disc. 
This  disc   is   unusual   in  that   a  read/write  head   is    included  for  each 
track,   thus  eliminating  head  positioning  time  when  access   is  made.      At 
1710   revolutions   per  minute,   the  average  rotational   latency  time   is    17.3 
milliseconds,   while  the  actual  transfer   rate   is   117,000  words   per   second. 
This    last    is   the  figure  when  more  than  one  disc   sector    (a  sector  holds 
64  words)    is   accessed  during  a  transmission;    it    is    somewhat    lower  than 
the   single-sector  rate   because  of   intersector  gaps.      At   this   speed,   the 
entire   16K  core  memory  can  be  copied  onto   the  disc   in  0.175   seconds, 
which   includes   the  maximum   latency  time  of    35  milliseconds.      On  this   disc 
the   sector  address    is   automatically   incremented  during  a  multiple-sector 

16Requisition  N62271-67-C-0013,  13  October  1966,  from  Supply  & 
Fiscal  Officer,  Naval  Postgraduate  School,  to  Navy  Purchasing  Office, 
Washington,   D.C. 

17SDS  940  Computer  Reference  Manual,   pp.    75-78. 
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transfer.      Manually-controlled    prevention  of   writing   operations    is    pro- 
vided  for  each  of   the   four    32,768-word   blocks;    this    feature  will   prove 
useful    in  preserving  permanent    files,    such   as   disc-resident   portions   of 
the  operating  system,    against    inadvertent  destruction. 

The   two   displays    and   the   analog  equipment    will   be   connected   to   the 
SDS   930   through  separate,    direct    accesses   to   core  memory.      Except   for 
initialization,   these   devices  may  operate    independently  of   the  central 
processor,    under  the  following   condition.      Core  memory   is   divided    into 
two   8K  blocks;    when   a   display  or    analog    input/output    transfer    involves 
the  block  other  than  the  one  which   the  central    processor    is   currently 
accessing,    the   two   actions    are    independent,    and  the  processor   is   not   held 
up  at   all.      However,    when  the  transfer  operation   and   processor   simultane- 
ously  use   the   same  memory  bank,    the   transfer  will   take  precedence,    and 
the   processor  will    be   delayed  one   memory  cycle    time.8      This    fact    suggests 
that   due  to  display   refresh  requirements,    as    far   as   possible   user   pro- 
grams   and   the  display   buffers   should  occupy  different  memory  banks. 

In  constrast ,    all   the   other   peripheral   devices,    including   the  disc, 
will   be   joined  to   the   digital   computer  through  what    is   termed  a   time- 
multiplexed   communication  channel.      This   channel   shares   use   of   an   inter- 
nal  register  with  the   central    processor,    and    input/output   operations   on 
it   always    involve    cycle-stealing.  This    is    not   to    say   that    input/output 

cannot    take  place    concurrently  with   computing,    for    it    can,    but    computing 
time   will    increase   by   the   number   of   memory   cycles    used    for   the    input/out- 
put   operation,    at    a   rate  of   two    cycles    per  word   transferred. 

18SDS   930    Computer  Reference   Manual    (SDS   Publication   900064D, 
February,    1966),    p.    28. 

19SDS   930   Computer   Reference  Manual,    p.    25    and   p.    28. 
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The   SDS   930   possesses   no   hardware   aids   to  the  address   translation 
part   of   program  relocation.      In  particular,    the   fourteen-bit   address 
field  of   instruction  words   provides   an  address   space  or  virtual  memory 
exactly  equivalent   to   the   physically- implemented    16K  of   core.      There   is 
no   provision  for  translation  on   a   segment,    page,   or   program   basis. 
Finally,    there  will  be   no   core  memory  protection  feature.      Such  feature 
is   a  very   desirable  corollary  to   any  address   translation  method.      An 

SDS  option  which  provides   write  lock-out   protection   in  512-word  blocks 

20 
of   core   is    available,        but    it  was    not  ordered. 

The    software  or   systems   programming  to   be   furnished  with  the  D.C.L. 

SDS   930   is,    with  one   exception,    designed   solely   for  non-mult iprogrammed , 

non-interactive  batch-processing.      It    includes   MONARCH,    a  magnetic  tape- 

21   22 
oriented  operating  system.      '        A  disc-resident  version  of  MONARCH  is 

now  in  preparation.      There    is   a  second  operating   system,    Real-Time 

23 
MONITOR,        now  being  written.      It   too   is  disc-resident. 

MONARCH  provides   batched   assemblies,    compilations,   and  executions 

in  any  combination  for  any   number  of   programs.      Its    language  processors 

are   SYMBOL,   META-SYMBOL,    FORTRAN   II,    and  Real-Time  FORTRAN    II.24      Input/ 

output   devices   which   it   will   handle  are  card   reader   and  punch,    line 


20Ibid. ,    p.    4. 

21SDS  MONARCH  Reference  Manual,    900   Series/9  300  Computers    (SDS 
Publication  900566B,   August    1965). 

22SDS  MONARCH  Technical  Manual,    900   Series/9  300  Computers    (SDS 
Publication  900616B,    October    1965). 

23SDS  Real-Time  MONITOR  Reference  Manual    (SDS   Publication  901108A, 
February   1966). 

Oil 

ALGOL  is  available  upon  request. 
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printer,   magnetic  tape,    paper  tape    reader  and  punch,   typewriter,    and    (for 

the  disc  version  only)   disc-      MONARCH  also   includes   the    library  of   SDS 

25 
programmed   operators. 

Real-Time  MONITOR   is    intended   to   add   a  generalized   interrupt-handling 
feature,    not   interactive  time-sharing,   to   a  standard   batch-processing 
executive.       Its   processors    include   FORTRAN   IV,    SYMBOL,    and   META-SYMBOL. 
It   will   handle  the   same  peripherals    as    Disc   MONARCH  and  also   has   the 
programmed  operator   library. 

Both  operating   systems    include   relocating   loaders.      These   routines 
are    intended   strictly   for   use    in  non-mult iprogrammed  batch-processing. 
They  make  no   provision   for  program  movement  or   address   translation   after 
execution  has   once   started. 

A  special   display   program  is   the  only   software   item  being   furnished 
which   immediately  provides    a  capability  for  more  than  batch-processing. 
It  offers    a  very  basic   facility  for  on-line   utilization  of   the  digital 
computer   from  the  display  consoles.      It   allows    source-language   program 
creation,    including  editing,    at   a  display   and   provides    for   transmission 
of  the  prepared   program  to   such  storage   as  magnetic   tape  or  disc.      When 
a  program   is    ready  for   assembly  or   compilation,    the  display  may  then 
function  as   the   system  control  medium  to   bring   in  MONARCH  to   perform  the 
desired   processing.      There   is   no   further   interaction  with  the  program 
until  MONARCH  has   finished   and,    if    requested,    the   program  has   executed. 
The  MONARCH  system  and   the  display   program  will   not   reside   in  core  memory 
or  operate   at   the  same  time. 

25SDS   930  Computer  Reference  Manual,    p.    A-17. 
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The   Key   Factors 
Of   all  the  mentioned   features   of   the  Digital  Control    Laboratory   and 
its   new  computing   system,    the   following   are   considered    to    be   the  most 
important    in  terms   of   their  effect    upon  the  choice  of  method   for  program 
relocation: 

a)  most   programs   will   be   small,    containing  hundreds,    rather 

9  ft 

than  thousands,  of  instructions; 

b)  there  will  be  only  two  interactive  users,  at  the  displays; 

c)  a  disc  with  a  high  transfer  rate  and  low  latency  time  is 
to  be  available;  and 

d)  the  computer  has  no  hardware  or  programming  aids  designed 
to  facilitate  any  method  of  relocation. 

These  are  the  key  factors  to  be  remembered  in  the  analysis  below. 

Relocation  Analysis 

The  use  of  segmentation,  of  any  two-level  address  translation  scheme, 
appears  to  be  neither  warranted  nor  feasible  in  the  Digital  Control  Lab- 
oratory.  The  limited  address  space  of  the  SDS  930  would,  in  itself, 
prevent  the  gaining  of  the  advantages  of  segmentation.   In  addition,  the 
complexity  of  the  necessary  hardware  and  programming  would  be,  relatively, 
immense.   The  System/360,  Model  67  is  good  proof  of  this  last  point. 

Employment  of  a  single-level,  blocks  and  pages  method  of  program  re- 
location would  theoretically  allow  the  most  efficient  allocation  of  core 
memory.   Further,  address  translation  may  be  accomplished  in  hardware  at 
small  cost  in  terms  of  execution  speed.   Recalling  that 

9  ft 

This  point  might  be  questioned  in  view  of  the  expanded  capabil- 
ities of  the  D.C.L.   However,  with  the  System/360,  Model  67,  the  NPGS 
Computer  Facility  has  received  a  sizeable  increase  in  its  computing  power 
and  should  continue  to  attract  most  large  programs. 
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t 
FA  =  — - —  f  (4.4) 

a    ntm 

it  is  calculated  for  the  SDS  930,  where  t   is  1.75  microseconds,  with 


m  =  3 


50  st    s  200   nanoseconds 
a 

that  0.005SF    S0.025 

That    is,    program  execution  time  would   be    increased,    at  most,    by   about 
2.5%.      A  possible  paged   address   translation  scheme   for  the  D.C.L.    is 

shown   in  Figure   15.      It    is  modelled  upon  the  mapping   performed    in  the 

27 
SDS    940,        which  also   employs    2,048-word  pages.      The   advantage  of  using 

such   a  page   size   in  the  D.C.L.    is   that   this  makes    possible  the  use  of 
two  mapping  registers,    as    in  the  940.      This   similarity  would  reduce  the 
original  design  effort   required   for  the    implementation  of  blocks   and 
pages    in  the   D.C.L.      Otherwide ,    shorter  program  pages,    perhaps    1,024 
words,   would   probably  be   advisable    in  the  Laboratory   in  view  of   the  many 
small   user   programs.      Unlike  the   940' s,    the  translation  scheme   shown  does 
not   provide  for   a   physical   core  memory  of   64K  words;    instead,    provision 
is   made  only   for   a  core  of    32K,   which  seems    a  reasonable   limit   to   poten- 
tial  D.C.L.    expansion   in  view  of  the   small   number  of  concurrent   users. 
The   extra  bit   thus  made   available   is   to   be  used   for  core  memory   protec- 
tion,   which,    now  with  two   bits   per  block,    could    include   four  forms.      The 
SDS   940,   which  reserves  one   bit   per  block   for  such  protection,    thus  pro- 
vides  only   two   forms. 


27SDS  940  Computer   Reference  Manual,    pp.    8-9. 
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Figure  15.   Possible  page/block  address  translation  scheme  for  the  D.C.L. 
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Use  of  the  blocks  and  pages  method  of  program  relocation  in  the 
D.C.L.  would  require  extra  expense,  considering  the  relocation  registers 
and  additional  logic  needed.   Coding  would  also  have  to  be  written  to 
dynamically  allocate  the  core  memory  and  to  update  the  relocation  regis- 
ters upon  program  exchange.   Presumably,  however,  programs  written  for 
these  purposes  under  Department  of  Defense  sponsorship  at  the  University 
of  California,  Berkeley,  for  its  modified  SDS  930  would  be  available,  so 
the  original  effort  required  in  this  respect  at  the  D.C.L.  would  be  re- 
duced.  The  primary  disadvantage  of  using  blocks  and  pages  in  the  Labora- 
tory is  still,  however,  the  relative  complexity  of  implementation.   It 
cannot  be  denied  that  the  effort  required  would  be  considerably  more  than 
that  necessary  if  the  relocation  register  or  "no  relocation"  methods  were 
chosen.   Further,  the  small  number  of  D.C.L.  users,  coupled  with  the  fact 
that  a  high-speed  disc  is  available,  suggests  that  the  time  advantages 
gained  from  paging  would  be  minimal.   That  is,  very  fast  swap  times  can 
be  attained  on  a  program  basis,  unpaged,  as  will  be  shown  later  in  this 
sect  ion. 

Employment  of  a  relocating  register  would  be  less  complex  than 
blocks  and  pages,  while  still  providing  potentially  for  more  than  one 
user  program  to  be  in  core  memory  at  one  time.   Programs  would  have  to 
be  moved  as  one  contiguous  block,  but  this  might  not  be  a  significant 
disadvantage  when  most  are  small.   Considering  again  this  factor  of  size, 
the  swapping  out  of  a  user  program  when  interrupted,  to  provide  space  for 
an  incoming  process,  will  not  always  be  required.   Recalling  (4.13),  it 
can  be  seen  that  this  will  keep  down  the  program  exchange  time.   Of 
course,  since  t  f   0  with  the  relocating  register,  there  will  be  some  in- 
crease  in  program  execution  times  due  to  the  address  mapping  time.   The 
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amount  would  be  small,  however,  of  about  the  same  magnitude  range  as  cal- 
culated above  for  the  blocks  and  pages  method. 

This  relocation  method  does  necessitate  the  design  and  construction 
of  the  mapping  register  and  associated  logic.   The  latter  involves,  in 
particular,  the  screening  of  instruction  codes  during  execution  to 
select  those  for  which  address  translation  will  be  performed.   Program- 
ming will  also  be  needed  to  keep  track  of  available  space  within  core 
memory  and  to  provide  shifting  of  user  programs  within  core  in  order  to 
free  space  for  an  incoming  process.   The  relocating  register  method  also 
requires  the  incorporation  of  memory  protection,  usually  in  the  form  of 
two  bounds  registers  which  restrict  the  range  of  access  of  the  program 
being  executed.   If  this  method  of  relocation  were  to  be  chosen  for  the 
D.C.L.,  some  assistance  in  its  implementation  could  probably  be  obtained 

from  The  RAND  Corporation,  which  employs  it  on  a  PDP-6  computer  in  the 

98  99 
JOSS  time-sharing  system.   ' 

The  final  method  to  be  considered  is  "no  relocation",  or  the  swap- 
ping of  jobs  upon  program  exchange  with  no  address  mapping  within  core 
memory.   The  program  to  be  executed  next  is  always  loaded  starting  at  the 
same  address.   In  the  simplest  application,  which  is  that  which  is  con- 
sidered here,  the  active  user  program  is  the  only  user  program  in  core 
memory;  this  is  called  complete  job  swapping. 

The  disadvantage  of  this  method  is  the  large  amount  of  program  move- 
ment into  and  out  of  core.   Except  when  a  program  is  ended,  an  "out" 
movement  is  required  upon  program  exchange,  and  an  "in"  movement  is  al- 
ways necessary. 

98 

Interview  with  R.L.  Clark,  The  RAND  Corporation,  Santa  Monica, 

California,  9  February  1967. 

2  Bryan,  G.E. ,  JOSS:  User  Scheduling  and  Resource  Allocation  (The 
RAND  Corporation,  Memorandum  RM-5216-PR,  January  1967),  pp.  2-4;  17-18; 
39-47. 
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On  the  other  hand,    "no   relocation"  would  certainly  be  the  simplest 
method  to   add  to    a   fundamentally  non-time-sharing  computer   such  as   the 
SDS   930.      Its   allocation  of  core  memory  to  one  user  program  at   a  time 
emulates  that  of  the   non-mult iprogrammed,    batch-processing   software  being 
furnished  with  the  machine.      Thus,    if    it   were  to  be   the  method    chosen, 
more  of   this   extensive   programming  might   be  usable   in  the  D.C.L.      Since". 
t      is   zero,    there   is   no    increase    in  program  execution  time  due  to   address 

c* 

mapping.   Also,  the  core  memory  protection  required  will  be  minimal.   The 
relative  simplicity  of  this  method  can  be  counted  upon  to  produce  the 
smallest  size,  N,  of  relocation  program. 

The  time  taken  for  program  exchange  with  "no  relocation"  may  be 
tolerable  in  the  Laboratory  for  two  reasons:  there  are  only  two  inter- 
active users,  for  whom  response  time  is  most  critical,  and  there  is  a 
high-speed  disc.   The  latter  provides  for  very  fast  program  swap  times, 
such  as  the  following: 

Operation  Time 

Swap  out  and  in  a  IK  0.09  seconds 

program 

Swap  out  a  IK  program;  0.12     " 

swap  in  a  5K  program 

Swap  out  and  in  a  5K  0.16     " 

program 

All  of  these  times  include  the  maximum  latency  time  for  both  the  "out" 
and  "in"  transfers.   In  addition,  they  are  calculated  for  completely 
non-overlapped  input/output.   Thus  they  are  absolute  maxima,  and  yet  they 
are  short  in  terms  of  human  reaction  times. 

These  times  may  be  directly  compared  to  those  required  if  a  reloca- 
ting register  method  were  implemented.   The  comparison  will  be  based  upon 
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the  technique  developed  in  Section  4  of  this  thesis.   The  variable  is  L, 
the  average  program  length.   It  is  again  assumed  that 


^R^NR  <*•"> 


and  that      (T   )MD  =  ntmL  (4.12) 

em  NR     m 


Thus  the  following  expressions  apply: 


_2 


_       nt  L 

(T  )    =  t  Lf  +  61  m  +  2 (4.15) 

r  RR    a       nr       M 


(VNR  =  2INR  +  2ntmL  (4-16) 


Suitable  values  are  chosen  for  the  D.C.L.  930: 


t  =  100  nanoseconds 
a 


f-l 


I   =  500  microseconds 
NR 


n   =  2 


t      =1.75  microseconds 
m 


M      =   8000   words    (i.e.    one-half   the   SDS   930"s 

core  memory  is  available  for  user  programs) 


The  results  are: 


<T  L  =  6.7(10~5)L  +  3  +  8.75(10"7)L 
(Tr)NR  =  1  +  7(10~3)L   (milliseconds) 

These  expressions  are  plotted  in  Figure  16  for  0"<LS1000  words,  i.e. 
for  the  short  program  lengths  expected  in  the  D.C.L.   The  plot  shows  that 
at  these  lengths,  the  relocating  register  method  has  little  time  advantage 
over  the  simpler  "no  relocation".   The  maximum  advantage  of  the  relocating 


75 


8- 


6- 


T  , 
r 

milli- 
seconds 


"No   relocation" 


Relocating  register 


0" 


■tiw 


200 


600 


800 


1000 


L,   words 


Figure    16.      Relocation  time   in  the  D.C.L.;    use  of   relocating 

register   compared  to  employment   of   "no  relocation" 


76 


register  is  only  4.05  milliseconds,  at  L  =  1000.  In  fact,  until  L  =  300, 
"no  relocation"  is  actually  faster,  due  to  the  larger  initialization  time 
associated  with  the  relocating  register  method. 

Finally,  "no  relocation"  in  the  sense  of  complete  job  swapping  has 
been  demonstrated  through  a  number  of  successful  applications  to  be  a 
practical  method  for  use  under  conditions  of  full-scale  time-sharing.   It 
is  the  relocation  method  now  employed  in  the  General  Electric  Company's 
commercial  time-sharing  system,  which  serves  up  to  40  concurrent  users. 

It  is  also  employed  at  the  System  Development  Corporation,  for  the  31 

30 
users  of  its  AN/FSQ-32  system. 

Recommendation 

Considering  the  requirements  of  the  Digital  Control  Laboratory,  there 
is  no  need,  in  program  relocation,  for  more  than  complete  job  swapping. 
The  time  taken  for  swapping  would  not  be  excessive,  and  the  relative 
simplicity  of  implementation  is  most  attractive.   It  is,  then,  the  recom- 
mended method.   When  the  programs  are  to  be  exchanged,  a  transfer  out  of 
the  entire  old  user  program  would  occur,  and  the  new  user  program  would 
be  loaded  beginning  at  one  fixed  address. 

Only  if  some  large  programs  are  found  to  be  a  common  occurrence  in 
the  new  system,  may  one  refinement  be  found  worthwhile.   This  would  be 
to  swap  out,  upon  program  exchange,  only  so  much  of  the  old  user  program 
as  is  required  to  free  sufficient  core  space  for  the  new  user  program. 
When  the  old  user  is  large,  and  the  new  user  is  small,  relatively,  a 
significant  amount  of  time  may  be  saved  by  thus  avoiding  the  unnecessary 
relocation  of  the  entire  old  user  program.   At  some  point,  the  time  saved 


30 

Interview  with  E.  Myer,  System  Development  Corporation,  Santa 


Monica,  California,  14  February  1967 
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should  become  greater  than  that  required  to  execute  the  extra  coding 
needed  to  keep  track  of  how  much  of  each  user  is  in  core  at  any  moment. 
For  details  of  a  possible  implementation  of  complete  job  swapping 
in  the  Digital  Control  Laboratory,  see  Appendix  II  of  this  thesis. 
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7.      Conclusion. 

In  order   to   be   able   to   make    a   recommendation   for   the   new  computing 
system    in   the  Digital   Control    Laboratory,    this   thesis    investigated   the 
general    subject   of    program  relocation   in   a  multiprogramming  environment. 
Four   methods   of    program   relocation   were    identified   and    analyzed: 

a)  "No   relocation",    now   used,    for   example,    on   the  General 
Electric    Company's    commercial    time-sharing   system; 

b)  Relocating   register,    successfully   employed   at  The   RAND 
Corporation; 

c)  Blocks    and    pages,    featured    in  the   SDS   940;    and 

d)  Segmentation,    implemented    in   the   IBM  System/360,   Model    67. 
Basic   upon   the   D.C.L.'s    specific   requirements,    a   recommendation   for   use 
of    "no   relocation"  was   made.      Thus    the  announced   aim  of   the   thesis   was 
achieved. 

One   general   conclusion  other   than  the   recommendation  was    reached 
during  the  writing   of   this    thesis.      As    research   progressed,    it    became 
evident   that   the  technical   elegance,    in    itself,    of  a   relocation  technique 
is    a   very   poor   criterion   upon  which   to   base   a   choice   of   method    for   a 
particular    system,    such    as    the    D.C.L.      First,    the  more   elegant   the  method, 
the  more   complex    its    implementation.      Second,    the   relocation  methods 
studied   all    differ    in   elegance,    yet    each  has    found   practical    application. 
The   reason   for   this    is    that    far  more    important    factors   than   elegance    are 
found    in   the  environment   and   features   of   the   target    system.      What   was 
learned    in   the  writing  of   this   thesis    is   that    these   factors   must    be    iden- 
tified,   for   they,    properly,    will    affect   most    the  choice  of    relocation 
method.      Technical   elegance    is    a    secondary   consideration. 
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APPENDIX  I 
SUMMARY  OF  MAJOR  DEFINITIONS 


Batch-processing 


Demand  paging 


Logical  address 


Mult  iprogramming 


that  operation  of  a  computing  system 
in  which  programs  are  collected  into 
groups,  or  batches,  and  are  then  pro- 
cessed from  start  to  finish  without 
programmer  intervention. 

a  page-turning  algorithm  in  which  pro- 
gram pages  beyond  the  current  one  are 
brought  from  secondary  storage  into 
main  memory  only  when  referenced. 

a  memory  address  as  contained  within  a 
program;  when  relocation  is  used,  that 
which  is  translated  into  a  current 
physical  storage  location. 

that  operation  of  a  computer  which  per- 
mits the  execution  of  a  number  of  pro- 
grams in  such  a  way  that  none  of  the 
programs  need  be  completed  before 
another  is  started  or  continued. 


Program  relocation 


Project  MAC 


Real-time   computing 


within   a   computing   system,    the   physical 
movement    of    programs    and  translation 
of   program-contained  memory   addresses 
into   actual    storage    locations. 

the  on-line,    multiple-access,    time- 
sharing  computing   system  of   the 
Massachusetts    Institute   of   Technology. 

program  execution   to   satisfy   a   particu- 
lar  operational    response   time,    which 
ranges    in  different   applications    from 
microseconds   to  minutes. 


Re-entrancy 


a   characteristic   of   a  program  which  can 
be  executed   for  more  than  one   user 
concurrently;   meaning,    there   is    no 
internal   data    storage   or   address   modi- 
fication which  will   affect   results    if 
a  second  user  enters   the   program  be- 
fore  a   first    has    finished. 


82 


Time-sharing  -         that  operation  of  a  computing  system 

which  permits  a  number  of  users  to 
employ  it  simultaneously  in  such  a 
way  that  each  is  or  can  be  completely 
unaware  of  the  activity  of  the  others. 

Virtual  memory         -         or  address  space;  a  term  for  the  maxi- 
mum addressing  capability  of  a  computer, 
not  all  of  which  is  necessarily  imple- 
mented in  physical  storage. 
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APPENDIX   II 

ON   RELOCATION 

IN   THE   DIGITAL  CONTROL  LABORATORY 

The   purpose  of   this   appendix   is   to   discuss   further  the   implementa- 
tion of  program  relocation  on  the   new  computer   in  the   Digital  Control 
Laboratory.      Taken   as    a  point   of   departure   is   the  recommendation  made    in 
Section   6  that    "no   relocation",    or    job  swapping,    be  the  method  used.      Cer- 
tain assumptions   relative   to   a  time-sharing  operating   system  for  the  D.C.L, 
are  described,    and   a  Program  Status  Table  to   be  used  during  relocation   is 
defined.      Timing  considerations    and  memory   protection  requirements    are 
discussed.      Finally,    charts   showing  the  possible   flow  of   relocation  are 
presented . 

A  fundamental   premise   is   that   the  operating   system  will   be  designed 
initially,   within  the   goal   of   providing  time-sharing  between  the  two 
displays    and  other   peripherals,    to    use    as   many   portions    as    possible  of 
Disc   MONARCH  or  Real-Time  MONITOR.      This    assumption  was    certainly   a   con- 
sideration  leading  to  the   decision   to   recommend    "no  relocation",    and    it 
seems   quite   reasonable   in  view  of    the    limited   amount   of   time   which   is 
available    among  D.C.L.    users    for   writing   a   new  operating   system.      It 
does,    however,    place   restrictions    upon   the   philosophy  of   operation.       In 
particular,    it    implies    that   each  user  program  while  executing  will   have 
the   computer   to    itself,    as    far   as    core  memory    is    concerned,    save   the 
portions   reserved   for  the  operating   system's    resident    and  for  special 
functions    such   as   the  display   buffers.      This   will   provide   the  closest 
emulation  to   standard,    non-mult i programmed    use   of   MONARCH/MONITOR.      Fur- 
ther,   programs   are  to   be   formed    -    subroutines    linked,    and    a   copy   of   all 
common   routines    attached    -   before   execution.      No    attempt    is   to   be  made   to 
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refer  to  a  single  copy  of  common  matter.   With  these  restrictions  placed, 
it  becomes  possible  to  hope  to  employ  the  MONARCH/MONITOR  language  pro- 
cessors, such  as  FORTRAN  and  META-SYMBOL,  and  input /output  drivers  with 

relatively  few  modifications . 

31 
The  scheduling  program  (called  here,  SKED   )  is  to  be  the  dominant 

routine  within  the  new  operating  system.   This  is  a  reasonable  assumption 
consistent  with  its  responsibility  for  controlling  the  flow  of  jobs 
through  the  computer.   It  is  further  assumed  that  SKED  will  be  the  first 
system  routine  entered  when  execution  of  one  user  program  is  interrupted, 
and  the  last  routine  employed  before  control  is  transferred  to  the  next 
user  program.   Thus  SKED  will  be  in  a  position  to  oversee  the  housekeep- 
ing and  other  services  performed  between  user  program  quanta.   Figure  17 
is  a  representation  of  how  processing  might  flow  in  the  computer.   Further, 
SKED  must  be  responsible  for  storing  the  old  user's  machine  conditions  - 
in  the  D.C.L.  SDS  930,  this  means  the  contents  of  the  A,B,X,  and  P  regis- 
ters, and  the  status  of  the  overflow  indicator  -  and  for  setting  the 
conditions  for  the  new  user.   These  actions  will  be  the  first  and  last 
tasks,  respectively,  performed  during  the  service  period  between  user 
program  quanta. 

The  disc  will  be  used  for  storage  of  both  temporary  and  permanent 
files.   The  relocation  program,  named  RELOC,  controls  the  temporary  sec- 
tion, employed  to  hold  the  core  images  of  programs  which  have  been  inter- 
rupted prior  to  their  completion.   The  permanent  portion  contains  the 
non-resident  parts  of  the  operating  system,  including  the  language  pro- 
cessors.  If  space  permits,  this  portion  may  also  hold  certain  user 

on 

Program  names  used  in  this  appendix  are  chosen  only  for  their 
brevity,  consistent  with  some  amount  of  meaningf ulness . 
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time 


user 
program 
service 


SKED 


accounting, 
other 


SKED 


RELOC 


SKED 


T 


user 
program 
service 


store   old   user's 
machine  conditions 


determine 
next   user 


set   up  machine 

conditions   for 

new  user 


swap  out  old  user, 
load  new  user 


Figure  17.   Flow  of  processing  within  D.C.L.  SDS  930 
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programs  in  an  inactive  status,  such  as  between  working  sessions  on  a 
course  or  thesis  project.   Figure  18  shows  symbolically  the  organization 
of  the  disc. 

To  the  operating  system,  each  user  will  be,  in  fact,  a  program.   Thus 

a  person  using  the  computer  from,  say,  a  display  console  is  known  by  the 

32 
name  of  his  program,   not  by  some  other  means  such  as  his  own  name  or 

console  number.   If  he  is  assembling  or  compiling  before  executing,  his 
program  name  will  first  identify  his  copy  of  the  language  processor  which 
he  is  using.   Later,  it  will  refer  to  his  object  program  in  execution. 

There  is  a  need  for  definition  of  at  least  four  different  user  pro- 
gram statuses.  These  might  be  named  NEW,  ACTIVE,  DEAD,  and  SAVE.  Their 
meanings  would  be  as  follows: 

NEW  -         indicates  a  program  which  has  not 

yet  received  its  first  quantum 
for  execution 

ACTIVE        -         refers  to  a  program  which  has 

executed  at  least  once,  but  which 
is  not  finished 

DEAD  -         this  program  has  finished,  and  its 

core  image  may  be  discarded 

SAVE  -         this  program  has  also  terminated, 

but  it  is  desired  to  store  its 
core  image  for  future  use 

The  different  operating  system  programs  will  use  these  status  indicators 
to  determine  which  of  the  possible  alternatives  open  to  them  will  be  fol- 
lowed as  they  perform  their  functions. 

Affected  by  the  above  will  be  entries  in  a  Program  Status  Table 
established  and  used  jointly  by  SKED  and  RELOC.   Each  user  program  will 


Or,  equivalent ly ,  by  a  number  assigned  to  his  program  by  the 
system.   Using  a  number  might  save  space  in  the  Program  Status  Table 
(q.v.),  but  it  also  implies  some  name-to-number  and  number- to- name  trans- 
lation. 


1    -   Temporary   section 

la   -  temporary   files 

(core    images   of    interrupted    programs) 


2    -   Permanent   section 

2a   -   permanent    files 

2b  -  catalog  (two  parts,  as  indicated  by  dotted 
line,  one  for  system  programs  and  one  for 
saved   user  programs) 


Figure    18.      Organization  of   the  disc 


be  included  in  this  table  from  the  time  when  it  is  NEW  until  it  becomes 
either  DEAD  or  SAVE.   Permanent  entries  will  be  present  for  operating 
system  language  processors.   The  Program  Status  Table  will  be  a  part  of 
the  core  resident  portion  of  the  operating  system.   A  complete  entry  for 
a  program  will  contain  the  following  items: 

a)  program  name 

b)  size  of  core  image 

33 

c)  first  word  address  when  loaded 

d)  current  location,  i.e. 

disc,  temporary  or  permanent  section 
magnetic  tape  no.  1  or  no.  2 
core  memory 

e)  machine  conditions  to  be  set  up  before  next  execution 

of  program. 

Item  e)  will  be  principally  maintained  by  SKED,  while  RELOC  will  use  a) 
through  d)  in  performing  program  relocation.  A  possible  format  in  core 
memory  for  a  Program  Status  Table  entry  is  shown  in  Figure  19. 

The  principal  disadvantage  of  job  swapping  as  a  relocation  method 
is  the  program  exchange  time  involved.   With  so  few  users  in  the  D.C.L. 
system,  the  time  required  here  should  be  tolerable.   Nevertheless,  it  is 
certainly  desired  to  minimize  the  overhead  caused  by  relocation.   Most  of 
this  overhead  will  be  due  to  transfers  between  disc  and  core.   In  turn, 
the  time  taken  for  these  transfers  depends  upon  two  factors,  the  transfer 
rate  and  the  rotational  latency  of  the  disc.   The  former  is  a  fixed  quan- 
tity, and  the  time  required  for  actual  transfer  cannot  be  submerged  since 
the  disc  is  attached  to  the  cycle-stealing  time-multiplexed  communication 
channel.   The  effect  of  the  rotational  latency,  however,  can  be  reduced. 

^Normally,  the  FWA  will  be  fixed,  and  thus  could  be  eliminated, 
for  all  user  programs.   However,  it  may  vary  for  the  language  processors 
and  is  therefore  included. 
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entry 


+1 


current    location 
of    program 
core,    disc    (temp, 
or    perm.) ,    mag. 
tape    #1   or    #2 


+  2 


overflow  - 
indicator 


+  3 


+4 


+5 


+  6 


+7 


0 


(preceding  entry) 


M 


(A) 


(B) 


(X) 


10 


(P) 


(following  entry) 


\ 


10  23 

core  size 


10  23 

FWA 


23 


Program  name 


Machine 
conditions 


Shaded  areas  show  bits  reserved  for  table  expansion/evolution 


Figure  19.   Format  of  Program  Status  Table  entry 
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This  is  possible  because  as  the  disc  rotates,  the  current  sector  address 
can  be  read  into  the  computer  at  any  time.     Thus  RELOC  may  set  up  a 
transfer  but  not  initiate  it  until  the  proper  sector  is  under  the  read/ 
write  heads,  provided  that  other  useful  functions  are  available  to  be  per- 
formed in  the  meantime.   Or,  RELOC  may  alter  the  order  of  transfer  of  the 
words  within  the  block  being  moved  to  take  advantage  of  the  current  sec- 
tor location.   This  means,  perhaps,  that  a  transfer  may  be  initiated  with 

35 
the  middle  of  the  block,  rather  than  its  beginning.     The  added  coding 

complexity  should  be  worth  it  in  either  case,  considering  that  the  average 
rotational  latency  of  17.5  milliseconds  is  as  long  as  the  actual  transfer 
time  for  a  program  of  2,048  words.   An  attempt  has  been  made  in  the  relo- 
cation routines  presented  in  this  appendix  to  follow  the  set-up  of  a 
transfer  operation  with  another  function  which  might  be  accomplished  while 
waiting  for  the  disc  to  rotate  to  the  proper  sector.   However,  since 
latency  times  are  measured  in  milliseconds,  the  provision  of  enough  func- 
tions to  submerge  a  major  part  of  the  expected  time  in  this  way  would 
certainly  involve  use  of  other  operating  system  programs  not  discussed 
within  this  thesis.   It  is  difficult  to  say  more  about  timing  until  de- 
tails are  known  for  the  disc  input/output  handler  to  be  furnished  with 
Disc  MONARCH  and  Real-Time  MONITOR.   This  routine,  it  is  assumed,  will  be 
investigated  for  possible  use  in  the  D.C.L.  operating  system,  and  any  em- 
ployment of  it  may  well  affect  time  considerations. 

A  modest  form  of  core  memory  protection  would  be  very  desirable, 
even  in  the  D.C.L.  system  where  only  one  user  program  is  to  be  in  core 

3Z+SDS  940  Computer  Reference  Manual,  p.  77. 

JJIbid.  An  example  of  the  use  of  this  technique  is  given.  As  547 
microseconds  are  required  for  one  sector  to  move  by  the  read/write  heads, 
there  is  considerable  time  available  for  block  manipulation. 
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at    any   one  moment.      This    is    so   that   user   programs  do    not    inadvertently 
destroy  parts  of  the  operating   system  resident,    thus   preventing  continu- 
ous  operation  of   the   time-sharing   system.      The   experience   of   the    System 
Development   Corporation   in  this    respect    is    informative;    with  no   memory 
protection,    the   AN/FSQ-32   time-sharing   system  never   ran   for    longer   than 
ten  minutes   before   a  user  destroyed   a  portion  of  the  resident  executive. 
Bounds   registers   are   now  installed,    limiting  the  range  of   access   of   each 
user   program.  One  way  to   provide  protection  of   specified   areas  of 

core  memory   in  the  D.C.L.    system  would   be  to   purchase   the   SDS   930   memory 
write    lock-out    feature.      This    option   allows    program-   or   manually-controlled 
prevention  of  writing   into   any  or   all   512-word  blocks    in  core;    when  an 

attempted  violation  occurs,    a    "no  operation"   and  trap  to   a   fixed    location 

37 
result.  Addition  of  this   feature  would,   of  course,    involve   extra  ex- 

pense.     It   would   also    be  possible   to   design   an   implementation  of  memory 
protection  at   the   Naval   Postgraduate  School.      One  method  would   involve 
a  single  bounds   register.      This  method   would   be   feasible   if   all   that    is 
to   be  protected   -   the  operating   system  resident   and    any  other   reserved 
areas   -    is    located  either  above  or  below,    in  core  memory,    the   user   program 
area.      The  bounds   register,    functioning   for  specified  operation  codes   when 
user   programs   are  executing,    would   cause   a  trap  whenever   the   instruction 
address   specified   a   location  within  the  protected   area.      Figure   20   shows 
an   allocation  of    core  memory    in  which   the   resident    and    its    tables,    forming 
the   protected   area,    are  placed   at   the  uppermost   addresses.      This  method 
of   protection  is  quite  restricted    in  flexibility,    but    its   relative  simpli- 
city   is    in   keeping   with   the  goals   of   the    D.C.L.    system. 

36 Interview  with  E.    Myer,    System   Development   Corporation,    Santa 
Monica,    California,    14   February    1967. 

37 SDS   930   Computer  Reference  Manual,    p.    4. 
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Figure  20.   Possible  allocation  of  core  memory  in  D.C.L.  SDS  930 
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The  routines  suggested  for  RELOC  are  charted  in  Figures  21  through 
25.   They  implement  job  swapping  in  keeping  with  the  thesis  recommendation 
and  with  the  assumptions  made  in  this  appendix.   The  overall  relocation 
process  is  shown  in  Figure  21,  including  the  entering  arguments  from  SKED. 
The  remaining  figures  depict  the  two  major  routines,  DUMP,  which  swaps 
out  the  old  user  program,  and  UNDUMP,  which  brings  in  the  new  one. 

RELOC  routines  frequently  reference  the  Program  Status  Table.   In 
this  connection,  there  is  one  particular  problem  which  must  be  solved. 
This  problem  is  how  to  make  the  required  change  to  a  user  program  PST 
entry  after  an  assembly  or  compilation,  before  execution  of  the  object 
code  produced.   While  a  user  is  employing  a  language  processor,  the  PST 
entry  refers  to  his  copy  of  that  processor;  when  the  assembly  or  compila- 
tion is  complete,  however,  he  is  finished  with  the  processor,  and  his  PST 
entry  must  be  changed  to  reflect  the  object  program.   How  and  when  this 
is  to  be  accomplished  is  a  system  problem  which  must  be  resolved.   One 
solution  would  be  to  add  at  the  end  of  each  assembler  and  compiler  a  short 
routine  which  will  investigate  its  binary  output  medium,  on  which  is  the 
object  program.   The  program's  length,  and  starting  and  transfer  addresses 
could  be  located  there,  and  with  these  known,  the  required  PST  entry  could 
be  created. 

One  final  matter  to  be  determined  at  the  system  level  is  design  of 
the  loading  programs.   Each  of  the  operating  systems  being  furnished  with 
the  D.C.L.  SDS  930  already  includes  one  or  more  of  these.   Tape  MONARCH, 
for  example,  incorporates  two  loaders.   One  loads  binary  object  programs, 
including  the  output  of  the  SYMBOL  and  META-SYMBOL  assemblers;  the  other 
handles  previously  compiled  FORTRAN  programs.   Both  are  designed  to  accept 
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Figure    21.      D.C.L.    relocation  overall 
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Figure  22.   Swap  out  of  old  user 
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Figure    23.      Swap   out   of   old   user    (continued) 
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Figure    24.      Loading  of    new  user 
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Figure   25.      Loading  of   new  user    (continued) 
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38  "39 
files  written  in  the  relocatable  standard  binary  language  of  SDS.   ' 

References  to  programmed  operator  and  FORTRAN  library  routines  are  satis- 
fied by  attachment  of  copies  at  load  time.   The  Disc  MONARCH/Real-Time 
MONITOR  loaders,  when  written,  will  doubtless  have  similar  features.   In 
the  D.C.L.,  all  the  features  mentioned  will  be  necessary,  especially  for 
the  initial  loading  of  a  program  previously  compiled  or  assembled  by  one 
of  the  standard  language  processors.   After  that,  there  are  no  further 
external  references  to  be  handled,  and  in  the  proposed  system,  no  re- 
quirement for  relocatability  within  core  memory.   A  simple,  absolute 
loader  will  handle  the  movement  of  core  images  in  and  out  of  main  memory 
during  program  exchange  subsequent  to  the  first  one.   It  will  probably 
not  be  feasible  to  hold  permanently  resident  in  core  memory  a  powerful 

relocating  loader,  because  of  its  size;  the  binary  object  program  loader 

40 
of  Tape  MONARCH,  for  example,  occupies  about  1480-j^q  words.    Thus  the 

D.C.L.  system  should  anticipate  the  use  of  a  short,  resident,  absolute 
loading  program  whenever  possible,  calling  upon  a  non-resident  relocating 
loader  only  when  its  special  features  are  required. 

There  are  many  problems  to  be  solved  before  the  Digital  Control  Lab- 
oratory will  have  a  functioning  time-shared  computing  system.   If,  however, 
job  swapping  is  chosen  as  the  method  of  program  relocation,  it  is  believed 
that  a  reasonable  start  has  been  provided  for  its  implementation  in  this 
system. 


38SDS  MONARCH  Reference  Manual,  900  Series/9  300  Computers,  p.  59. 

39SDS  SYMBOL  and  META-SYMBOL  Reference  Manual  (SDS  Publication 
900506E,  October,  1966),  p.  66. 
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SDS  MONARCH  Technical  Manual,  900  Series/9300  Computers,  p.  38, 
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